When the psychologist offers a device for measuring the effectiveness
of advertisements, such as the Order of Merit Method, or
one of the Attention or Memory methods, it should be evaluated 
not in comparison with some ideal method but rather with the 
other actual methods in use with which it would have to com- 
pete. In investigating current methods of measuring advertising 
efficiency the writer was surprised to discover the reliance which 
is at present placed upon the return coupon as a measure of the 
value not only of a particular piece of advertising copy, but of 
an advertising medium which carries the copy. This reliance 
seems not to be limited to cases of strictly mail order selling, 
but is extended to cases where the product may be purchased 
directly in any locality, and where the return coupon concerns 
merely samples or descriptive literature. Publications have been 
known to lose valuable clients because the number of “replies” 
through that medium does not come up to expectations. There
must be many factors determining whether readers of a particular 
medium will or will not indicate their interest in advertised 
articles by filling in and mailing a return coupon. The price of 
a magazine as well as the nature of its contents, determining the 
character of its readers; its editorial policy such as that of ad- 
vocating buying from local dealers; the distribution of readers 
in city and country; all may give clues to the possible attitude 
of the reader toward the return coupon. In addition to these, 
there are other more strictly psychological factors involved.
These, like most others which determine the human reaction to 
advertising, can be discovered by painstaking research. This 
report deals with conditions that are not peculiar to any one 
medium, but rather with the effectiveness of the return coupon 
in general. Two of these will be discussed, namely (1) the dependence
of effectiveness upon attention value, and (2) the dependence
of effectiveness upon ease of using the coupon.

1. A casual survey of any magazine will provide abundant 
evidence of the lack of attention-getting power in the return 
coupon. Its size is usually reduced to a minimum; its position 
is at the extreme bottom of the advertisement or in a lower 
corner, with none of the customary mechanical devices employed 
to carry the eye to it; its type is small and plain. It would seem 
that a reader might well have his interest aroused in the article 
advertised and yet fail to be stirred to action by the appeal of 
the coupon, or worse still might fail to notice the coupon at all. 

To test the attention value of return coupons in comparison 
with the advertisements that carry them, a simple experiment 
was arranged. Six full page advertisements (black and white 
only) were taken from the October 14 issue of the Saturday 
Evening Post. Three of these contained return coupons of the 
usual sort, while the other three had none. This set of six 
advertisements was presented to 210 persons each of whom was 
asked to examine them as much as he liked, but more carefully 
than he would in looking over a magazine. Immediately after 
this examination was finished, the advertisements were withdrawn 
and each person was asked to write the names of the articles 
advertised. The test was made extremely easy by using only six 
advertisements and testing for knowledge of them immediately 
after so that as large a proportion of the people tested as pos- 
sible should remember all of them. That most of the persons 
did remember all the advertisements may be seen in column 4
of the accompanying table, where 189 of the total of 210 re- 
membered the advertisement making the lowest score and 202 
of the total of 210 remembered the one making the highest score. 
It will be noticed that the six advertisements were nearly equally 
well remembered. 

As soon as the names of the articles advertised had been 
written, these sheets were collected, and others containing a 
printed list of these articles were given to each person with in- 
structions to check those articles whose advertisements had re- 
turn coupons. The results of this test are given in the table. 
The first column gives the name of the article advertised, columns 
2, 3 and 4 give the number of persons who recalled each ad- 
vertisement. The scores for men and women are reported separ- 
ately in columns 2 and 3 and are then combined in column 4. 
Columns 5, 6 and 7 give the data for the recall of the coupons, 
5 and 6 giving the records separately for men and women and 
7 combining these into a total score. The first three advertise- 
ments in the table contained return coupons while the last three 
contained no coupons. The figures in the lower right hand corner of the
table, those followed by an asterisk (*), represent the cases where
advertisements were said to contain coupons when they really did
not. Disregarding the records for men and women separately, 
we find that 30, 55 and 31 persons reported falsely concerning 
the presence of coupons. These figures indicate the degree to 
which guessing occurred, and furnish the means for calculating 
approximately how much guessing occurred in the case of advertisements
which really did have coupons. This may be done
roughly as follows: It will be observed that the last three advertisements
were recalled by 592 persons and of these 116 or
about 20 per cent said they contained coupons when they did
not. The first three advertisements were recalled by 571 persons
and of these 212 or about 40 per cent said they contained coupons.
But some of these persons must have been guessing and
we may assume that about the same proportion guessed here as
in the case of the last three advertisements, namely, 20 per cent.
Making this correction, it would appear that only about 20 per
cent of the people who recalled the advertisements recalled whether
they had return coupons. As our correction for guessing is
conservative, this 20 per cent is more likely to be too high than
too low.

                                       NUMBER OF PERSONS WHO
                                RECALLED ADVERTISEMENTS     RECALLED COUPONS
            (1)                  (2)     (3)     (4)       (5)     (6)     (7)
    Articles Advertised          Men    Women    Both      Men    Women    Both

ADVERTISEMENTS WITH COUPONS
Fleischman’s Yeast                81     109     190        40      68     108
Booth’s Sardines                  82     110     192        18      31      49
Underwood Bookkeeping Machine     83     106     189        22      33      55
Total                            246     325     571        80     132     212

ADVERTISEMENTS WITHOUT COUPONS
Styleplus Clothes                 88     111     199        14*     16*     30*
Squibb’s Products                 86     105     191        25*     30*     55*
Apollo Chocolates                 91     111     202        12*     19*     31*
Total                            265     327     592        51*     65*    116*
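
The correction for guessing described in the text can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, using only the totals given in the table; the function and constant names are ours and purely illustrative, and the simple subtraction shown is the rough correction the text describes, not the only possible one.

```python
# Totals taken from the table: persons recalling the advertisements and
# persons reporting a coupon, for each group of three advertisements.
RECALLED_WITH_COUPON = 571      # recalled the three ads that carried coupons
REPORTED_WITH_COUPON = 212      # of these, said a coupon was present
RECALLED_WITHOUT_COUPON = 592   # recalled the three ads that carried none
REPORTED_WITHOUT_COUPON = 116   # of these, falsely said a coupon was present


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def corrected_coupon_recall() -> float:
    """Subtract the false-report (guessing) rate from the raw report rate."""
    guess_rate = percent(REPORTED_WITHOUT_COUPON, RECALLED_WITHOUT_COUPON)
    raw_rate = percent(REPORTED_WITH_COUPON, RECALLED_WITH_COUPON)
    return raw_rate - guess_rate


if __name__ == "__main__":
    print(round(percent(REPORTED_WITHOUT_COUPON, RECALLED_WITHOUT_COUPON), 1))
    print(round(percent(REPORTED_WITH_COUPON, RECALLED_WITH_COUPON), 1))
    print(round(corrected_coupon_recall(), 1))
```

Carried out exactly, the two rates are a little under 20 and a little under 40 per cent, so the corrected figure falls somewhat below the round 20 per cent used in the text.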

Since the ultimate measure of the value of an advertisement 
is the number of purchasers it creates, the return coupon as an 
indicator of effectiveness should register more than 20 per cent 
of the attention and memory value of the whole advertisement. 
By giving it some of the attention power devoted to the rest of 
the advertisement, it might be made to register more nearly the 
value of the advertisement or of the medium in which it is placed. 
Certainly if much reliance is to be placed upon the coupon to
indicate these values some effort should be made to bring it to 
the attention of the reader. 


The figures in the table showing the records for men and women 
separately both in the recall of advertisements and in the recall 
of coupons show only one difference worth noting. In every case 
the scores for women are higher than for men, a difference that 
is usually noted in tests of this sort. But the proportion of 
coupons recalled to advertisements recalled and the proportion 
of guesses does not differ materially for the two sexes. Likewise, 
no particular significance is to be attached to the difference in 
scores of the different advertisements: They were too few in 
number to enable conclusions to be drawn about the value of 
types of coupons etc. The high coupon score for the Fleischman 
advertisement is probably due to the large element of guessing 
in that case, but this cannot be determined definitely from our 
records. The influence of difference in size, position and layout 
of coupons is a matter which would be well worth investigating. 


2. The second condition upon which the effectiveness of the
return coupon depends is the ease with which it may be used.
There are a number of factors that might well be considered here. 
For example, the paper on which the advertisement is printed 
may be too smooth to permit of pencil writing and too porous to 
permit the use of ink without its blotting. Again, if not perforated, 
the coupon may tear in being removed from the advertisement. 
These are difficulties probably inherent in the medium and which
cannot be remedied. But there is another, which can be more 
easily handled. This has to do with the amount of space allowed 
for writing the name and address on the coupon. A glance at a 
collection of coupons will show that many of them are entirely 
too short to receive an ordinary signature. Now, it may seem at 
first a very small matter to expect the reader to reduce his handwriting
sufficiently to meet the coupon conditions. Still, it must
be remembered that in many cases, if not in the majority of 
them, the decision to react or not to react by filling in and mailing 
a coupon hangs on a very slender thread. The resistance which 
any slight obstacle offers may be just enough to sway the reaction
in the negative direction. 

A series of measurements was made of a random collection of 
110 signatures, and a series of 435 return coupons, some of them 
from full page and some from half page advertisements in large
size periodicals such as The Ladies Home Journal and the Saturday
Evening Post; others were taken from full page and half
page advertisements in standard size magazines such as Scribners
and the Century Magazine. The spaces allowed for both the
signature and the address were measured but only the former will 
be discussed. The accompanying chart will show graphically the 
degree to which the spaces fall short of accommodating the signatures.
The horizontal scale for every curve represents the
length of the signatures and the signature spaces in centimeters,
and the vertical scale represents the number of cases of a given
length. The topmost curve shows the distribution of signature
lengths, ranging from about 2.5 cm. to about 11.5 cm., with the
median length at 6.9 cm. The second curve shows the distribution
of signature spaces in coupons from full page advertisements
in large size magazines; the third, in coupons from half
page advertisements in large size magazines; the fourth, in coupons
from full page advertisements in standard magazines; and
the fifth, in coupons from half page advertisements in standard
size magazines. The solid line representing the median length
of the signatures is carried down through the other curves, whose
own medians are represented by dotted lines. Although there
are a few coupons with enough space to accommodate the largest
names, the median of the names is considerably larger than the
medians of any of the spaces. The relationship may also be
expressed in terms of the per cent of the signature spaces that 
equal or exceed in length the median of the signatures. These 
figures are given below. 


                                            Median     % Reaching
                                            Length     or Exceed.
                               No. Cases    in Cm.     Med. Name
Signatures                        109         6.9         20
Full page Ad. Large Size          133         5.9         20
Half page Ad. Large Size           55         4.9         20
Full page Ad. Standard Size       116         6.0         20
Half page Ad. Standard Size        22         5.3          0 (Approx.)


If the distributions of the spaces were equal to those of the 
signatures, there should be 50 per cent of the cases of the former 
exceeding the median of the latter. Thus in three of the cases, 
30 per cent of the names are too large for the spaces, and in the
fourth case, they are about all too large to go in the space pro- 
vided. There appears to be an even greater discrepancy between 
the height of writing space provided and the height of space 
needed for the signatures, although no actual measurements have 
been made to support this statement. 
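
The measure used in the last column of the table, the per cent of signature spaces that equal or exceed the median signature, can be sketched as follows. This is only an illustration in Python: the individual measurements are not reported in the study, so the lengths below are hypothetical values invented for the example, and the function name is ours.

```python
from statistics import median


def percent_reaching(spaces_cm: list[float], signatures_cm: list[float]) -> float:
    """Per cent of spaces at least as long as the median signature."""
    med = median(signatures_cm)
    reaching = sum(1 for space in spaces_cm if space >= med)
    return 100.0 * reaching / len(spaces_cm)


# Hypothetical lengths in centimeters, invented for illustration only.
signatures = [4.0, 5.5, 6.5, 7.3, 8.0, 10.5]
spaces = [4.5, 5.0, 5.9, 6.0, 7.2]

if __name__ == "__main__":
    # Only one of the five hypothetical spaces reaches the median signature.
    print(percent_reaching(spaces, signatures))
    # A distribution compared with itself gives one half, as the text expects.
    print(percent_reaching(signatures, signatures))
```

When the spaces are distributed like the signatures the measure comes out at 50 per cent, which is the standard against which the observed 20 per cent is judged.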

That the space allowed for coupon purposes is not limited by 
the size of the advertisement is clear from the fact that the coupon 
in the standard magazines tended to be larger than that in the 
large size magazine, both for the full and half page advertisements
(medians of 6.0 and 5.3 compared with medians of 5.9 and
4.9). The real reason is to be sought in the failure to recognize 
the importance of providing adequate space. Adults have their 
writing habits firmly established, so much so that writing a signa- 
ture or an address is quite automatic. Interference with the 
smooth flow of the automatic responses when once begun, arouses 
resistance and an unpleasant feeling-tone. This is illustrated in 
the well-known Downey tests for character traits in which the 
subject is asked to reduce, enlarge or otherwise alter the character 
of his writing, and in which the measure is in terms of resistance 
encountered in complying with the instructions. Now, it is just 
this unpleasant feeling-tone which the advertiser tries in so many 
other ways to avoid, e.g. by the use of beautiful illustrations, 
beautiful color combinations, graceful border treatments, appro- 
priate type faces, etc. The coupon, if it is to be relied upon or 
used at all, should certainly not be permitted to defeat its own pur- 
pose or the purpose of the whole advertisement by allotting too 
little space to it. 

This brief study points to three conclusions, namely, that the 
return coupon cannot adequately measure the effectiveness of an 
advertisement or a medium, except in direct mail selling, because 
of the complexity of factors on which reaction to it depends; that 
as used at present it lacks sufficient attention-getting power to at- 
tract more than a small portion of the readers of the advertise- 
ment; and that in the size in which it most commonly appears it 
is likely to create an unpleasant feeling-tone because of inadequate 
space allowed for writing. 








FLUCTUATIONS IN MENTAL PRODUCTION WHEN 
MOTIVATION IS THE MAIN VARIABLE 


By F. B. KNIGHT and H. H. REMMERS 


GAP BETWEEN POSSIBLE AND ACTUAL PRODUCTION 
IN COLLEGE STUDENTS 


No one knows with useful accuracy just how great is the gap 
between the potential capacity of a college student and his ac- 
complishment in college courses. There are, no doubt, many stu- 
dents who, if we make a rough use of Franzen’s Accomplishment 
Quotient, possess 100 units of potential capacity and whose ac- 
complishment in college work is appropriately 100 units. These 
are efficient students, for there is a negligible gap between potent- 
ial power and mental production. On an invested capital of abil- 
ity the yield is satisfactory. 

There are other students who possess say 100 units of capacity, 
but whose mental production is 80 units, or 70 units, or even prob- 
ably 50 units. These students do not do as much as they should 
when their ability to work is considered. We might very well 
suppose that three students have 100 units of potential power 
each, and one of the three turns in 98 units of work; another 84; 
and the third 70 units. The second and third are not then doing 
the amount or quality of work we can fairly expect in the light 
of their potential capacity. They are not doing mental work ef- 
fectively. 
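
The comparison of the three students can be put as a simple ratio of production to capacity. The sketch below, in Python, is a rough illustration only: the ratio is our simplification in the spirit of the "rough use" of the Accomplishment Quotient mentioned above, not Franzen's own definition, and the function names are hypothetical.

```python
def accomplishment_ratio(production_units: float, capacity_units: float) -> float:
    """Production expressed as a percentage of potential capacity."""
    return 100.0 * production_units / capacity_units


def gap(production_units: float, capacity_units: float) -> float:
    """Units of potential capacity not turned into production."""
    return capacity_units - production_units


if __name__ == "__main__":
    capacity = 100  # each of the three students is supposed to have 100 units
    for produced in (98, 84, 70):
        print(produced, accomplishment_ratio(produced, capacity), gap(produced, capacity))
```

On this reckoning the first student leaves a gap of 2 units, the second 16, and the third 30.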

In our high schools we are rapidly getting our requirements for 
individual students on a scientific basis. Pupils with approximately
equal mental ability must do approximately equal work. The same
quantity or quality of work to be produced by three pupils, say 
80 units in each case, is not estimated a “B” for all students ir- 
respective of their difference of ability. A student with 120 units 
of ability who hands in 80 units production may be failed, whereas 
80 units of production by a student of 100 units of capacity would 
be judged as creditable work, and one of 80 units production with 
an 80-unit mind should receive definite commendation for his 
work.¹

The college instructor is a genuine teacher to the extent that the
gap between potential power and actual accomplishment in each
one of his pupils is small and tends to grow smaller. That wide
gaps of this kind exist in most students in the average college of
liberal arts could probably be neatly demonstrated if time and
money for such a research were available.

¹Of course we do not mean literally that intellect need or even should be
thought of as a collection of units. This phrase is used only to make plain
that differences in intellectual ability, and the relation of these differences
to college work, are of importance. Neither can we describe college work in
terms of units. It would be a clear gain if we could. The point is that
considering the relative efficiency of students’ work, differences of mental
ability counted in, is better than considering work with differences of mental
ability neglected.


WHY THE GAP EXISTS 

The size of this difference between what a student could do and 
what he does do is a function of many factors, among which are 
undoubtedly, (a) the irrelevance of the subject matter to the pur- 
poses of the student, (b) the successful competition of many extra- 
curricular activities, (c) the inadequate manipulation of the laws 
of learning by the teacher and the like. 

Now the laws of learning or the laws of correct teaching operate 
as truly in the college classroom as they do in the primary school 
or the eighth grade. Lack of sound pedagogy in the classrooms of 
a university can not be defended on any ground. In fact there is 
a growing realization on the part of many that skill in teaching 
as teaching must be highly prized and honored by college admin- 
istrators. 

Instruction on “How to Teach” is being provided for the younger 
members of several university staffs already. 

DATA ON ATTENTION 

One of the necessary conditions of learning is attention of a 
high level on the part of the learners. It may be properly asked, 
“What are the attention habits of college students?” 

The following data may be interpreted as very strongly sug- 
gesting that one reason for the gap between what could be done 
and what is done by college students is that the kind of attention 
(or lack of it) given by students in the college classroom hinders,
if it does not make impossible, rapid or accurate learning. If in 
this simple chart, line A means a very low degree of attention; 
line B means the kind of listless, passive attention that one gives 
to familiar surroundings while riding; line C represents a rapt 
attention, such as one gives to the reading of a telegram or to 
the tense crises in a football game, we can see that the amount
of time spent in a recitation or lecture with attention of A type
or B type is really but little more than time out. The kind of 
learning required in college work takes place rarely if ever except 
when the matter in hand is attacked with type C attention. What 
per cent of a 50-minute recitation is spent by a student giving 
type C attention? What per cent type B or type A? No one
knows, and of course the individual variations are very wide indeed. 
A student does not know with accuracy the amount of time spent in 
the subject matter with these types of attention. His opinion, 
though, is probably better than a guess. On the whole he is apt 
to report the matter with errors favorable rather than unfavorable 
for himself.*

The writers selected groups of college juniors for study. The 
students were on the whole among the “better sort.” Their opin- 
ions concerning the kind of attention actually paid in classroom 
work is reported in the following tables: 


TABLE I 
DISTRIBUTION OF KINDS OF ATTENTION IN PER CENT OF TOTAL 
RECITATION TIME FOR CULTURAL COURSES, 
COLLEGE OF LIBERAL ARTS 

% A B 
5 5 
10 9 1 
15 8 
20 11 
25 3 
30 
35 
40 
45 
50 
55 
60 
65 ¢ 
70 
75 
80 
85 
90 1 


A, B, and C refer to kind of attention as defined in context; % 
refers to % of average recitation. To be read: five students thought
5% of the recitation time involved type A attention; nine students 
thought 10% of the time involved type A attention, etc. 

These distributions give a spread of the type of attention juniors 
give in the average recitation. The central tendencies are: (1) 
that high, effective attention, type C, is given 15%-20% of the 
time; (2) that a lower, less acute attention is given 55%-60% of 
the time; (3) that between 15%-20% of the time practically no
attention is given at all. Little learning can go on when attention
of type A or B is given. Obviously a central improvement in
college instruction would be to make those changes in class procedure
which would tend to cause a large percentage of recitation
time to involve that type of attention which is essential to the
learning process.

*For data on the unconscious tendency of over-estimation see: Pitfalls
and Rating Schemes, Knight-Franzen (Journal of Educational Research,
May, 1922).

When we have a group of students in the college of commerce 
estimate the percentage of time in the recitation given to the three 
types of attention, we get a slightly different result. The central 
tendency of the lowest kind of attention drops 5%. The middle 
50-50 type of attention takes up about 50% of the time. The ef- 
fective type of attention is the rule about 25% of the time, slightly 
higher than in orthodox cultural courses. These data are spread
in Table II. 


TABLE II
COLLEGE OF COMMERCE 
% A B C 
5 8 
10 3 
When, however, we go over into the definitely vocational courses
of law, medicine, and engineering, the opinion of the students as 
to the kind of attention paid in class is in startling variance with 
comparable opinion of students in the college of liberal arts. Data 
on 50 law students are spread in Table III.


*Read the same as Table I. 
TABLE III
LAW COLLEGE 
% A B C 
5 21 15 
10 14 8 
15 6 6 
20 4 3 
25 1 3 
30 0 2 
35 1 1 
40 1 0 
45 0
50 0 2 
55 2 0 
60 2 
65 3 
70 4 
75
80
85
90
95

Summary of these data: The drift toward effective attention 
habits is significantly paralleled by drift in curricula toward voca- 
tional content. This means, in the opinion of the writers, nothing 
more or less than that conscious motivation causes large fluctuations
in mental production. 

A SECOND INVESTIGATION 

The above data are worthy of consideration despite the absence 
of objective measures of mental production. We now will de- 
scribe an experiment in which adequate objective data further 
support our contention. 

THE SUBJECTS OF THE EXPERIMENT 

The subjects of this experiment were ten college freshmen who 
had been subjected to a strenuous regime of humiliation and 
fatigue duty during their probation period as pledges of a fra- 
ternity. A brief summary of this regimen is in order here. 

The first day of probation each one of the ten subjects was 
given a rather strong dose of physic (two Hinkle’s pills). They
were not permitted to shave or bathe during the entire period of 
probation. A fake branding, very realistically carried through, 
which, according to the testimony of the chairman of the initiation 
committee was obviously a considerable mental shock, was a part 
of the program. Each freshman had to carry with him wherever 
he went some rather weighty or bulky article, such as a brick, a pail,
a piece of gas-pipe, etc. Besides these quantitatively rather unmeasurable
indignities, which further included generous doses of
“paddling” with barrel-staves, the eating of raw liver (ostensibly 
dog-meat) sprinkled with asafoetida; the application of embar- 
rassing, rather humiliating names to the pledges, etc., perhaps the 
more significant features from the point of view of this experi- 
ment are that the men were permitted to sleep only from one to 
two hours out of every twenty-four, and that during each of 
their twenty-two waking hours they were compelled to do menial 
or nonsensical tasks about the house, or more frequently to go for 
long hikes of 12 to 15 miles. In addition to all this, they were 
required to attend to their regular university work. 

The probation period began on Monday, and the experiment to 
be described here was carried out on the following Friday late at 
night. Here we have a group of workers, fatigued, harassed, and 
working at a time when the social customs of a college campus 
make work unusually distasteful. Eagerness for entrance to 
the fraternity was ostensibly related to scores made during 
this work period. Thus the cards were stacked against this group 
except for motivation. 

A few days later a group of 54 college juniors (mostly women) 
were given a similar but shorter test on the same material as a 
means of securing comparable data from “normal” individuals. 

Comparison of scores made on intelligence tests showed no significant
differences between the two groups. The juniors were
working under good conditions but with no special motivation. 
Thus the cards were stacked in their favor except for motivation,
there being none other than that operating in a normal college recitation.

MATERIAL AND PROCEDURE OF THE INVESTIGATION 

The materials used were the Thorndike addition sheets, each of 
which has forty-eight columns of ten addends with 1’s and 0’s 
omitted. “Any successive five of the columns are of a difficulty 
nearly, if not exactly equal.”* These sheets were arranged in pads. 
There was no possibility of the subject’s remembering the answers, 
for there was no possibility of identifying the seven different 
kinds of sheets. 

Our handicapped but motivated group was told by a member 
of the fraternity that they were to assist in a scientific experi- 
ment and that they were to push themselves to the limit as scores 
on the tests were involved in final election. Desire to qualify for 
the fraternity was the motivation. The attitude of the group as 
a whole was one of cordial co-operation. At the signal “Go!” 
they were to begin adding and to continue until a signal “Mark!” 
was given. This was done at the end of every five minute period. 


*See Garth, T. R., Mental Fatigue, Archives of Psych. No. 41, August, 1918.

Twenty-four such five-minute work periods were given with a ten-minute
intermission at the end of the first twelve periods, at
which time the subjects were given light refreshments. The 
“Mark” signal was for the purpose of getting an introspective 
check on the feeling-tone of the subjects as well as to measure 
progress by units of time. They were told to mark A, B, C, D, E, 
or F, where A signified a feeling of general well-being, no nega- 
tive feeling-tone or discomfort; E was to be given the significance 
of intense aversion or high negative feeling-tone; C was to mean 
a rather neutral, don’t-care-one-way-or-another attitude. During
the last three periods a strong effort was made to obtain an end 
spurt by urging the men to put all they had in them into the work. 


AVERAGE AND INDIVIDUAL SCORES OF THE TEN TIRED 
FRESHMEN ADEQUATELY MOTIVATED 


The raw data give a composite work curve that is very nearly 
a straight line (See Figure 1) for both the total amount done and 
the amount correct. Nor is there any noteworthy falling off in 
accuracy. The data from which this curve is constructed are 
shown in Table I.° The last five-minute period does show a 
smaller per cent correct than do any of the other periods, but the 
difference is slight. The curve of the percentage correct is shown 
in Figure 1 at the top. The straight horizontal line through the 
curve represents the arithmetic mean of the percentile values for 
each five-minute period. Whatever decrease in total amount or 
amount correct occurred may be shown by comparing the work of 
the first 12 periods with that of the last 12. Table II gives the 
data for each person so compared as well as the totals for the 
entire 24 periods. The percentage of correct work for the first 
12 periods is 85, that for the last 12, 83—a decrease of but two 
per cent! 
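
The comparison of the two halves of the work period can be reproduced from the group averages printed in Table II. The following Python lines are a minimal sketch under that assumption; the dictionary and function names are ours, and the figures are the "Av." row of Table II as printed.

```python
# Group averages per person (columns added), as printed in Table II.
FIRST_12 = {"total": 256.6, "correct": 218.2}
LAST_12 = {"total": 249.4, "correct": 208.8}


def percent_correct(block: dict) -> float:
    """Correct columns as a percentage of all columns added."""
    return 100.0 * block["correct"] / block["total"]


if __name__ == "__main__":
    first = percent_correct(FIRST_12)
    last = percent_correct(LAST_12)
    print(round(first, 1), round(last, 1), round(first - last, 1))
```

Computed this way the two figures are about 85.0 and 83.7, so the decrease is, if anything, a little smaller than the two per cent stated.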

The individual work curves are shown in Fig. 2.* Perhaps their
most striking feature is the consistent downward trend in the
curve of satisfyingness and the lack of a significant corresponding
downward trend in the output. The curve of satisfyingness, in
the case of every individual except Fitz—, shows a marked decline
with the progress of the experiment. In his case the intermission
for refreshments proved to be annoying rather than satisfying.

°As appears from the individual work curve (Fig. 2), subject B— came
four periods late. In order to calculate averages for the whole group, his
total output was weighted with his average production per five-minute
period for the twenty periods that he actually worked. This procedure
would seem to be fair. The general conclusions are not in any case different
than they would have been without such weighting.
TABLE I
Showing the amount of work, the amount correct, the amount 
incorrect, and the per cent correct for the ten freshmen in terms 
of columns added per five-minute period. 


Period          Total Am’t.   Total Correct   Total Incorrect   Per cent Correct

   1                223            188              35                 84
   2                237            206              31                 87
   3                188            158              30                 84
   4                206            176              30                 85
   5                217            179              38                 82
   6                219            194              25                 89
   7                218            183              35                 84
   8                208            175              33                 84
   9                223            183              40                 82
  10                216            181              35                 84
  11                186            160              26                 86
  12                212            176              36                 83
  13                227            189              38                 83
  14                240            197              43                 82
  15                215            188              27                 87
  16                194            162              32                 84
  17                205            172              33                 84
  18                204            184              20                 90
  19                207            179              28                 82
  20                188            162              26                 86
  21                191            155              36                 81
  22                198            157              41                 80
  23                200            169              31                 85
  24                203            157              46                 77
Av. per person     21.9           17.8             4.1                 -
TABLE II

Showing data for 10 individuals in terms of number of columns 
added. 

               Entire 24 periods           First 12 periods    Last 12 periods
Subject     Total   Correct   % Correct    Total   Correct     Total   Correct
Fitz         536      440       82.09       257      205        279      235
Fo           490      429       87.55       247      216        243      213
Gi           566      503       88.86       329      292        237      211
Har          536      459       85.63       277      233        259      226
Bai          352      276       78.41       182      135        170      141
An           581      478       82.27       282      237        299      241
Gr           518      466       89.96       264      236        254      230
Ne           705      571       80.99       358      300        347      271
Bo           312      229       73.40       150      128        162      101
DeS          464      417       89.87       220      200        244      217

Av.         506.0    426.8      84         256.6    218.2      249.4    208.8

Feeling-tone marks (columns A, B, C, D, E, F): the individual entries are
not legible; the averages are printed as 70 51 46 24 16 15.


The curves show no wide divergences of amount correct from 
total amount done—they practically parallel each other from one
five-minute period to the next throughout the experiment. 

The features of the work curve contended for by Kraepelin and 
his followers are not in evidence—at least not consistently for all 
the individuals. There is little if any more fluctuation upward at 
the beginning or at the end to indicate “Anlauf” or “Schlussantrieb”
than chance would account for. Subjects Fitz—, Fo—,
DeS—, Bai—, An— and Gr— show an increase both in amount
done and amount correct from the first to the second period. Bo—, 
Ne— and Hav— on the other hand show a decrease, while Gi— 
does not change in total amount done. The terminal spurt is no 
more in evidence, notwithstanding an effort to produce one by en- 
couraging and urging the men to increase their speed if possible. 
The amount done shows an upward trend for Fits—, Gi—,DeS—-; 


Ne—, Hav—, Bai—, Gr— and Bo— however, exhibit a down- 
ward curve. Nor did the intermission cause any uniform change 
in output. 





Fig. 3 shows graphically in another way the relation between
output and feeling tone. The curves were constructed by dividing
the total number of A’s, B’s, C’s and so forth, into the number
of columns marked by each, and plotting these results against the
letters on the abscissae. If feeling-tone correlated with output
there should then be a consistent more or less parallel downward
trend for the two curves. Some of the curves, those for Bo—,
DeS—, Hav—, Gr—, and Ne—, do show somewhat of a downward
trend. Fo— and Bai— show practically no change; Gi—, however,
increased markedly in output when his feeling-tone was most negative;
this is also true to a somewhat less extent for An—.

*The range of feeling tone indicated by each subject is plotted at the
right of the graph, as indicated by the letters A, B, C, D, E, F.


SUMMARY 

The experiment was carried on for 24 five-minute periods— 
that is, two working hours. The ten subjects had a total output 
of 5060 columns added, of which 4268 or approximately 84 per 
cent were correct. The average total amount added was approx- 
imately 21 columns per five-minute period per person; the aver- 
age amount correct per five-minute period per person was ap- 
proximately 18 columns. The range of individual achievement is 
from 40 columns attempted with 36 columns correct for subject
Ne—, to ten attempts and two columns correct for subject Bo—.
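
The summary figures can be checked with a few lines of arithmetic. The sketch below takes only the totals quoted in the Summary (5060 columns added, 4268 correct, ten subjects, 24 five-minute periods); the variable names are illustrative and are not part of the original report.

```python
# Totals quoted in the Summary; the variable names are illustrative only.
total_added = 5060      # columns added by all ten subjects
total_correct = 4268    # columns added correctly
subjects = 10
periods = 24            # five-minute periods

person_periods = subjects * periods

per_cent_correct = 100 * total_correct / total_added
added_per_period = total_added / person_periods
correct_per_period = total_correct / person_periods

print(round(per_cent_correct, 1))    # per cent correct
print(round(added_per_period, 1))    # columns added per person per period
print(round(correct_per_period, 1))  # columns correct per person per period
```

Rounded to whole numbers, these agree with the approximate values of 84 per cent, 21 columns and 18 columns given in the text.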


CONCLUSIONS¹

1. A two-hour work period, with this kind of material for in- 
dividuals who are already physically tired, is not sufficient to 
decrease the output to any appreciable extent. 

2. There is little or no apparent relation between feeling-tone 
and amount of work done. 

3. There is no evidence of a warming-up period or of initial 
spurt. 

4. Accuracy is little affected. 

5. In spite of an effort to obtain an end spurt, none was ob- 
tained. 


DATA OF FIFTY-FOUR RESTED JUNIORS WITH ONLY 
CLASS-ROOM MOTIVATION 

The essential facts for this group are shown in Table III. The
material used for the group was exactly like that used for the ten
freshmen. Fig. 4 shows graphically the work curves for the group,
both as to total amount and amount correct for each five-minute
work period. There is little that is noteworthy concerning these
data except the very greatly reduced output as compared with the
freshmen. Possibly this is to be accounted for on the basis of
class-room ‘set’ rather than on the ground of less ability in adding.
There is nothing to warrant the assumption that the freshmen as
a group were more skillful at this sort of thing by reason of
previous training. The differential of motivation and mental
attitudes is the explanation of the difference between an average of 21
columns per five-minute period for the freshmen and an
average of 11 for the juniors.*

¹These data give insight into certain aspects of “Mental Fatigue.” We
are using them only to show amounts of work as related to motivation.


The percentage correct for the juniors is in general the same as
for the freshmen. It will be observed, however, that the decrease
in amount correct is approximately the same for the juniors as
for the freshmen. There is no reason to believe that this decrease
is due to fatigue; the test was given at 8:00 a. m., the first hour
of the academic day.


TABLE III 


Data obtained from 54 college juniors in terms of number of 
columns added: 


              Total    Total     Per cent   Av. per   [heading     Av. correct   [heading
Period        added    correct   correct    person    illegible]   per person    illegible]

1st 5 min.    998      872       87         18        4.18         16            4.57
2d 5 min.     843      720       84         16        4.90         13            5.25
3rd 5 min.    902      755       85         17        5.46         14            6.14
4th 5 min.    868      725       82         16        4.92         13            5.05
5th 5 min.    836      690       83         15        4.96         13            5.20
6th 5 min.    857      710       83         16        4.71         13            4.96
7th 5 min.    876      726       81         16        5.09         13            5.53

Av.           6178     5198      83.125     16.3                   13.6


CONCLUSION


The above data support the following assertion: The difference 
between a genuine motivation such as the freshmen had, and the 
kind of motivation a college class contains, is a difference which 
not only offsets extreme fatigue, but further offsets freshman vs. 
junior ability, and in addition produces twice as much work per 
unit of time with equal accuracy. 


*The juniors were just an ordinary teaching group, a typical learning 
situation. 

INFLUENCE OF MENTAL LEVEL IN THE
FORMATION OF BOYS’ GANGS¹


By M. LAVINIA WARNER, 
Ohio University 


It has been noticed that certain boys group together; that a 
boy will have certain friends and not seem to have any other very 
close friends. There may be several circles of friends among the 
boys in a community and the circles will not intersect. 

This seems to indicate that there is a fundamental cause why 
certain groups of boys are found together and that it is not chance 
but some underlying law that groups boys. The boys are brought 
together by churches, schools, and the like, but the churches and 
the schools do not classify them into the different groups and gangs. 
Hence the question arises: Why are certain boys found in one 
gang and not found in some other gang? 

Dr. Charles B. Davenport in speaking of the mating of the
feeble-minded says, “The meager social life, the customs of their
parents, the natural ostracisms of the higher classes, and the
individual’s preference for a congenial mate induces inmarriage
among the mentally deficient.” (2)* He found that the more
frequent result of the immigration of a feeble-minded person is his marriage
into another defective strain in another part of the country. Altho
in different localities, their social standards have been practically
the same, for social standards are determined by mentalities as
well as by experience, education and wealth. Consequently their
social standards have been influenced mostly by mental level and
inmarriages of the defective strains have resulted. “The social
laws and natural preference of the individual is so powerful that
like will consort with like even in outmarriage.” (2)

Dr. Davenport says, “The law that like tends to marry like
interferes with the beneficent tendency (that the introduction of
new blood would lead to better progeny) of outmating. The
consorts selected from outside are frequently quite as defective as
those who select them.” (3)

Miss Mina A. Sessions gives the same ideas of marriage selection
as those given by Dr. Davenport. Miss Sessions says, “The
reason for so many cousin marriages are probably twofold. First
in their own stock they find the most congenial companions, and
second, they are largely forced to marry each other for the simple
reason that no one else will marry them or have anything to do
with them. When they do marry outside the family, they marry
some member of a weak strain.” (10) These investigations lead
us to believe that “Birds of a feather flock together,” even in
marriages of the feeble-minded.

¹This article is a condensed form. Complete article may be found at
Carnegie Library, Ohio University, Athens, Ohio, and in Columbia
University Library, New York City.

*Numbers throughout this thesis refer to Bibliography listed at the end.

The same idea holds true in marriages of the deaf. Dr. Fay 
says, “In no human relation is the truth of the proverbs, ‘Birds of 
a feather flock together’ and ‘A fellow-feeling makes one wonder- 
ous wise!’ more fully exemplified than in the friendships and mar- 
riages of the deaf.” 

In his study of the divorces of the deaf, Dr. Fay found the pro- 
portion where both partners were deaf to be 2.5 per cent and where 
one was deaf the other hearing, the proportion reported is 6.4 per 
cent. This shows that the matings were more successful where 
both partners were deaf than in cases where one was a hearing 
person. The deaf partners were more congenial than one of them 
could have been if he were connected by hearing and speaking with 
those around him. (5) 

The deaf have a common mode of expression. This enables 
them to appreciate and enjoy one another. Each is more or less 
barred from the hearing folk and he is glad to have a companion 
similar to himself. Their deafness causes them to have common 
experiences. It is the effect of their deafness that attracts and 
holds them in companionship. 

Dr. Davenport and Miss Sessions determined for the feeble- 
minded and Dr. Fay determined for the deaf, that there are com- 
mon factors at work in bringing and holding together each of 
these respective types of people. This suggests that the same 
principle could be applied to other types. 

In order to make a study of the factors determining boys’ gangs,
various groups of boys who have been brought into court and who
were known to be close companions by court officials, social workers
and school officers, have been investigated. The data were
derived from school reports, court records, and the records filed by
the Children’s Service Bureau, Youngstown, Ohio. The social
histories are the reports of investigations made by the bureau’s special
social workers. The psychological examinations were made by Dr.
H. H. Young, director of the Children’s Service Bureau, and his
assistants in psychology.


Thirteen groups and seven pairs of boys were tabulated and 
sixty-six different individuals were included in the study. 

The ages in each group were computed to the date of crime com- 
mitted by that group, or if no particular date could be obtained, a 
given date was taken. So in each group both the chronological 
and the mental ages of each child are given for the same date as
that of his companion. 


REPRESENTATIVE CLUBS AND GROUPS 
THE FRANKLIN CLUB 

A group of boys traveling about the city doing many little things 
that were not to be favored, attracted the attention of court of- 
ficers and social workers. These boys were well equipped for vari- 
ous sorts of business that require no arrangements with the other 
party concerned. 

The gang had three horses and an old wagon. They drove 
about the city picking up bottles, rubber and the like and “stealing 
oranges, bananas, whips, pop, beer, whiskey, cigarettes, copper,” 
etc. 

When about nine years old, A. S., one member of the gang, told
of cutting the windows at a gun shop and how they were chased
by a policeman. He said that they were caught twice and that
when caught they gave wrong names. He said that they generally
drove the best horse so that they could make a better get-away.
They got up at two or three o’clock in the morning and “went out
picking,” returning about five o’clock.

A. S. gave many instances which show that they are a real gang. 
Their environment leads them to do anti-social things and with 
their low mentalities, they do not always make a successful get- 
away. In all probability they are caught many more times than 
they would be if they were of higher mentalities. 

Their main work seems to have been stealing automobile parts 
from one party and selling them to another. In one case they 
would take the materials back and sell them to the “Jew” from 
whom they had stolen them. 

The gang was first noticed in particular when they broke into 
a jewelry store window in order to steal some nickels that were on 
display in the window. The boys got the idea of cutting the glass 
from a picture show. The actor cut the glass with a glass cutter, 
then tapped it lightly until a smooth piece fell out. The boys had 
gotten some glass cutters from a twenty-five cent store the day be- 
fore. They had stolen some of the cutters and had bought some. 
They were not as successful as the actor on the screen, however, 
in cutting the glass. A policeman came along, arrested them and 
took them to the Detention Home. They were given a hearing in 
the Juvenile Court and the judge suspended sentence in each case. 

A few days later a boys’ worker formed this group of boys into 
a club, later called the Franklin Club. The boys’ worker tried to 
interest them in athletics and games of various kinds. They were 
taken to the park on hikes and in every case their individual inter- 
ests were studied and appealed to as far as possible. 

For a period of a few weeks after these boys had been organized 
into the club, they seemed to be doing well, but they were soon 
back to some of their former habits. About a month after their
organization, some of them played truant and some one stole from a 
store in the community. Comparing their record of the past year 
with their behavior at this time, little improvement was found. 
After working with these boys and not getting the results which 
the boys’ worker hoped to get, he took them to the Children’s 
Service Bureau for mental examinations. 

The following table shows the age, mental age and I. Q. of each 
at the time of breaking the jewelry store window: 


TABLE I
THE FRANKLIN CLUB
               Chronological Age    Mental Age
Name           Years   Months       Years   Months    I. Q.
[illegible]    10      0            7       10        78
[illegible]    11      9            7       8         65
R. L.          13      2            8       1         61
[illegible]    11      5            7       11        69
[illegible]    11      0            7       2         65
S. V.          13      1            9       1         69


Table II gives the mean variation of the chronological ages to 
be eleven months and that of the mental ages to be only five 
months. 


TABLE II 
AVERAGE AND MEAN VARIATION 
Chronological Age Mental Age 
Years Months Years Months I. Q. 
Average 11 9 7 11 68 
Mean Variation 0 11 0 5 4.16 
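
The averages and mean variations of Table II can be recomputed from the six rows of Table I. The sketch below assumes that "mean variation" means the average absolute deviation from the arithmetic mean; the paper does not define the term, and the function names are illustrative only.

```python
# (years, months) pairs and I. Q.s read from Table I (the Franklin Club).
chronological = [(10, 0), (11, 9), (13, 2), (11, 5), (11, 0), (13, 1)]
mental = [(7, 10), (7, 8), (8, 1), (7, 11), (7, 2), (9, 1)]
iq = [78, 65, 61, 69, 65, 69]


def to_months(ages):
    """Convert (years, months) pairs to whole months."""
    return [12 * years + months for years, months in ages]


def mean(values):
    """Arithmetic mean."""
    return sum(values) / len(values)


def mean_variation(values):
    """Average absolute deviation from the mean (assumed definition)."""
    centre = mean(values)
    return mean([abs(v - centre) for v in values])


ca_months = to_months(chronological)
ma_months = to_months(mental)

# Averages and mean variations, ages in months.
print(round(mean(ca_months), 1), round(mean_variation(ca_months), 1))
print(round(mean(ma_months), 1), round(mean_variation(ma_months), 1))
print(round(mean(iq), 1), round(mean_variation(iq), 2))
```

Under that assumption the mean variations come to about eleven months for the chronological ages and five months for the mental ages, as stated in the text.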


Table I shows that the boy with the highest I. Q. is the youngest
chronologically and that the boy with the lowest I. Q. is the oldest
chronologically. The case of S. V., who ranks a year and more
above the others in mental age, is explained in his record, which
shows that he is slow and easily led by others. Dr. Young states
that S. V. is incapable of making his own decisions. He is highly
suggestible and is largely dependent upon his immediate environment
and associates. His easy way and lack of will power are
probably the cause of his being led by lower mentalities. Taking
the group as a whole, the tables show that the mental ages are
closely alike.


There are other gangs of boys in the neighborhood, but the 
Franklin boys are not found with them. The Franklin boys at- 
tend regular public schools, and this chance to meet other com- 
panions does not seem to affect their choice. Altho they come 
in contact with the higher boys, the duration of their companion- 
ship is so short and non-social that no noticeable event takes place. 
No doubt, the mentally higher boys commit little crimes, but they
bar from their company any one who is liable to expose them.
Consequently the type of mind found in the Franklin club has no
place with the higher type child. The lower level boy, moreover, would
not be able to contribute to the success of the gang, and being a
misfit he would go where he could be nearer the top in action and
where he would be on an equality with his fellows.

The public school attendance department shows that in cases 
of truancy among the Franklin club, the boys were either alone 
or in company with another member of the club. However, the 
record of J. F. shows that he played truant three times with two 
other boys but their mental ages do not differ from J. F.’s more 
than six months and the I. Q.’s are within the range of seven units. 

Court records show that boys from this club have committed
ten different crimes in company with one another, but there is no
account of any one of them being arrested in company with a boy
who rates with normal mentality. Three boys from the Franklin
club, however, have been in court with boys from other groups
which are included as subjects for this theme, and in each case the
other boys rate below an I. Q. of 69, with a mental age differing
by no more than a year and seven months from that of the Franklin
boys, and by less than six months on the average.

GROUPS NUMBERED SEVEN, ELEVEN AND THIRTEEN 

Of the boys examined in group seven, the one with the highest 
I. Q. is the youngest chronologically, and has the highest mental 
age. Altho the difference in mental age is greater than that in the 
Franklin club, yet the same tendency that like mental ages tend 
to group together is found here. 


TABLE III
GROUP SEVEN
            Chronological Age    Mental Age
Name        Years   Months       Years   Months    I. Q.
C. B.       15      0            10      8         71
J. B.       15      8            8       6         54
J. A.       15      5            No mental rating
J. B. A.    13      3            No mental rating
R. C.       15      6            9       4         60


C. B. and J. B. are very close friends, C. B. being the follower 
and admirer of J. B. J. B. is so low mentally that he will jump 
into any undertaking without fear of consequences and C. B. is 
low enough to follow the example. C. B. is not troublesomely in- 
clined and he would probably cause society less trouble if it were 
not for his close companionship with J. B. Although C. B. is not 
so low mentally as J. B. yet because J. B. is older and larger and 
is able to do striking things, C. B. easily falls in line with all his 
misdemeanors. 
Dr. Young says, “Time and again through the interview concerning
C. B.’s past history, he referred to how much he thinks
about J. B. taking money. He seems to be much impressed and to 
think much about how J. B. takes money from his father. The 
repeated reference to this together with the outline of what and 
how C. B. himself has stolen and what induced him to steal, seems 
to indicate that he has a dominating mental complex about J. B. 
and his stealing which with his low mentality is very largely, if 
not entirely the cause of his own stealing.” 


The individual records of C. B. and J. B. are very similar. Their
histories of theft and night wanderings are practically the same. 
It is not exactly an influence which J. B. has over C. B., but it is 
a kind of unconsciousness on J. B.’s part and a thoughtless fol- 
lowing on the part of C. B. 


J. A. and R. C. have also led C. B. to steal. J. A. was not given
a mental examination, but his school record shows him to be
retarded more than three years and he has a long court record and
a boys’ industrial school record. His father reports that he
cannot control him.


J. B. A. had no mental rating either, but his school record shows 
poor attendance and the principal reports that J. B. A. is in one 
day and out for weeks, and that all his school work is a complete 
failure. He is pushed on in the grades and does not fit anywhere. 
He is three years retarded in all his subjects. 


The histories of J. A. and J. B. A. and the mental ratings of 
the other boys show the group to be below normal and their mental- 
ities to be somewhat equal. As in the Franklin club we have no 
account of these boys chumming with mentally higher boys or 
boys of the normal type. 


J. B. and C. B. appear in Group Thirteen with A. S., a member of
the Franklin club, and with two other boys whose mentalities are
undetermined, but whose social standards are rated by social 
workers with those of their companions. Taking these boys as a 
group, we have a difference in mental ages of two years and six 
months and a difference in chronological ages of three years and 
five months. All are so mentally equipped that they are able to 
enjoy the same things. Each is in a group where like types are 
found. The I. Q.’s of these boys are below 71 and the I. Q.’s of 
their friends are below 78. 


In the study of Group Eleven, it was found that the chronolog- 
ical ages are in an inverse ratio to the mental ages. 

TABLE IV
GROUP ELEVEN
          Chronological Age    Mental Age
Name      Years   Months       Years   Months    I. Q.
A. K.     15      1            8       10        59
S. J.     13      8            9       7         70
G. B.     12      7            10      8         85
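
The I. Q. column of Table IV can be reproduced from the two age columns if one assumes the conventional ratio definition, mental age divided by chronological age and multiplied by 100. The paper does not state the formula, so the definition is an assumption here, and the function name `ratio_iq` is hypothetical.

```python
def ratio_iq(chron_years, chron_months, mental_years, mental_months):
    """Ratio I. Q.: 100 * mental age / chronological age (assumed definition)."""
    chron = 12 * chron_years + chron_months
    mental = 12 * mental_years + mental_months
    return 100 * mental / chron


# Rows of Table IV (Group Eleven):
# name -> (chronological years, months, mental years, months)
group_eleven = {
    "A. K.": (15, 1, 8, 10),
    "S. J.": (13, 8, 9, 7),
    "G. B.": (12, 7, 10, 8),
}

for name, ages in group_eleven.items():
    print(name, round(ratio_iq(*ages)))
```

Rounded to the nearest unit, the three quotients agree with the tabled values of 59, 70 and 85, which also shows the inverse order of chronological age and I. Q. within the group.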


The experience given the older children by living longer in the 
world tends to place them on an equal basis with the younger 
children who have higher mental ages. Consequently the experi- 
ence on the one side and the experience and the higher mental 
ability on the other enable the boys to enjoy one another’s com- 
pany. The inverse ratio is found in Table I also, but it is not 
emphasized there so much. In fact, this tendency is found in all 
the groups. 

Table V shows that with the exception of three groups and
one pair, the I. Q. of the youngest boy is greater than the I. Q. of
the oldest boy.


TABLE V
GROUPS OF MORE THAN THREE BOYS

Groups      Oldest    Youngest    Difference
1           61        78          17
2           92        69          23
Franklin    61        78          17
7           54        71          17
9           72        93          21
10          58        73          15

GROUPS OF THREE

4           62        56          6
5           80        80          0
6           56        75          17
8           92        72          20
11          59        85          26
12          58        82          24
13          54        65          11

PAIRS

14          69        72          3
15          59        85          26
16          82        93          11
17          62        72          10
18          62        72          10
19          105       70          35
20          63        100         37

Average difference    15







This shows that older boys with mentalities below normal have 
a tendency to group with younger boys who have a mental age 
near their own and slightly higher. The higher mentalities of 
the younger boys enable them to cope with the older boys who have 
the experience of living longer in the world. This illustrates how 
mental equality exists in groups formed by natural selection and 
indicates that mental abilities tend to go together. 


GROUP CONTAINING BOYS OF NORMAL MENTAL ABILITY 


In three groups and in two pairs boys were found who rank 
with much higher intelligence than the other members of the 
groups. These conditions were explained when it was found that 
in each case, excepting one, the boy with the I. Q. much higher 
than the others was psychopathic, and he also had bad heredity 
which would influence his choice of society or would lower his 
standards of society. 


In one pair, the lower type boy was twelve years and one month 
old and mentally eight years and six months with an I. Q. of 70 
and the other was twelve years and eight months old and 
mentally thirteen years and four months with an I. Q. of 105. 
The mentally higher boy is an adopted child. Nothing is known 
of his father. The whereabouts of his mother is unknown. His 
mother left him at a Salvation Army Rescue Home when he was 
two years old and never returned for him. The mother is reported 
to be a drinking and immoral woman. The foster mother says 
that the boy is very bossy and always wants others to do as he 
wishes. His psychological record shows that “he does not have 
the persistance to work out a task requiring careful decision and 
forethought. He is psychopathic and impulsive to the extent that 
thru the path of least resistance he fails to use the intelligence 
which he has.” The principal of the school which he attended
reports that his work was extremely poor, his behavior was very
queer and that he had “a mean disposition.” His foster parents
appealed to the court for a disciplinary school for the boy. They
felt that he needed stricter discipline than they wanted to give
him.

The family of the mentally lower boy moved into the same house 
with these foster parents. The boys were not acquainted before 
this. The younger boy is one of those nice looking, talkative chaps 
who is in for everything and who will undertake anything. The 
father does not live at home and the mother is unable to control 
the child. The older boy not caring to “use the intelligence which 
he has” would be glad to have anyone for a companion who would 
follow his suggestions and who would not be hampered by fear to 
do anything which might come up before the lads. The boys played 
truant together several times and once they ran off to another
city together. 

In another case the boy who was rated as of average mental ability
was considered by his mental examiners as one who would be
readily influenced by his associates; he would probably not
initiate much, but might readily become a follower. He is
psychopathic.

Another boy is much younger than the other boys in his group, 
but his mental age is about the same. He is lacking in initiative 
and is somewhat psychopathic in his responses. He is easily in- 
fluenced and too readily submits to the wishes of others. 

The boy who is not psychopathic is considered a case of adoles- 
cent instability. Possibly later he will grow more stable and will 
seek other boys for companions. 

It would be interesting to continue the study of these boys to 
see if they still chum with the mentally lower boys in a few years 
following, but it is impossible to do that in this research. 


INFLUENCE OF NATIONALITY AND PROXIMITY 


A study was made of the nationalities represented in the different
groups. The greater part of Youngstown’s population, about eighty per
cent, is foreign. In most of the groups there were as many
nationalities represented as there were cases. It was found that
nationality did not enter in. If the boys were all Italians, it was
because they lived and met in the same Italian district.

Proximity and school effected the meeting of most of the groups 
and pairs of boys. In a few instances the boys met in the down 
town section where they sold papers or where they loitered around 
the cheap theatres. In one case the two boys lived on opposite 
sides of the city. They met down town several times, became very 
good friends and finally decided to steal enough money from their 
fathers to run away. In nine groups the boys of each group at- 
tended the same school, in ten groups the boys attended adjoining 
schools, and in one group three of the boys attended different 
schools found in the same quarter of the city and the other two
boys in the group were the ones who lived on opposite sides of the
city.

This division of these particular groups would not hold true so
very long, however, because the school system often transfers the
pupils from one school to another and sometimes a child is placed
in a Parochial school for a term or two. In many cases the 
children living on the dividing line of two school districts may 
have a better chance of getting acquainted than children attending 
the same school. The after school hours are not so closely super- 
vised as the school hours and the children may have more freedom 
in choosing their playmates. Even on the playground at school the 
children are more or less held from freedom of choice by their
school officers. Going to and coming from school and the free 
hours afterwards give the best opportunity for natural selection 
of companions. During these free hours the boys may wander 
where they like and they may play with whom they choose. If 
they do not like one group of boys, they may play alone or go 
off with one or two others who enjoy the same things which they 
enjoy. 

VARIATIONS OF BOTH THE CHRONOLOGICAL AND THE
MENTAL AGES

In Table VI we find that the mean variation of the mental age
is less than the mean variation of the chronological age in three 
out of five of the largest groups, that is the groups containing 
more than three boys. This shows that there is a tendency for 
less variation among the mental ages than among the chronological 
ages. 

TABLE VI
MEAN VARIATION AND RATIO OF MEAN
VARIATION TO AVERAGE

             Mean Variation (months)      M. V. / Average
             Chronological   Mental       Chronological   Mental
Group        Age             Age          Age             Age

1            13              7            .09             .07
2            3               15           .02             .13
Franklin     11              5            .08             .05
9            17              9            .12             .08
10           16              17           .10             .14


Since the intelligence quotients are below 100, the mental ages 
are less than the chronological ages and are expected to be nearer 
together. In order to determine further that mental age has more 
weight than chronological age in the formation of groups of boys, 
the ratio of the mean variation to the average (coefficient of varia- 
tion) in both the chronological and the mental ages was com- 
puted in the larger groups and a comparison was made of their 
variability. It was found that the mental ages were less variable 
than the chronological ages in all the groups excepting in Group 
Two and Group Ten. See Table VI. In these two groups the 
chronological ages were slightly less variable than the mental ages. 
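
The ratio of mean variation to average can be illustrated with the Franklin Club figures. The sketch below uses the rounded values printed in Tables I and II (averages of 11 years 9 months and 7 years 11 months, mean variations of 11 and 5 months); the function name is illustrative only.

```python
def variation_ratio(mean_variation, average):
    """Ratio of mean variation to average, both in the same unit."""
    return mean_variation / average


# Franklin Club, all values in months.
chronological = variation_ratio(11, 12 * 11 + 9)   # average 11 yr. 9 mo.
mental = variation_ratio(5, 12 * 7 + 11)           # average 7 yr. 11 mo.

print(round(chronological, 2), round(mental, 2))
```

To two decimals these are the .08 and .05 entered for the Franklin Club in Table VI. Dividing by the average puts the two kinds of age on a common scale, so that the smaller mental ages do not of themselves make the mental ages appear less variable.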

The average of the mean variation of all the groups and pairs
was found to be 10 months for the chronological ages and 11
months for the mental ages. This includes the groups whose mean
variation was higher for the mental age than for the chronological
age and the pairs whose mean variations were not thought necessary
to compute before because of the small number of boys included
in each. The mean variation for the whole population (66)
was found to be 18 months for the chronological age and 16 months
for the mental age. Thus we find:








                              Chronological Age      Mental Age

Average M. V. of groups        10                     11
-----------------------        -- = .55               -- = .68
M. V. of population            18                     16
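
The two quotients can be checked directly from the figures given in the text: average group mean variations of 10 and 11 months, and population mean variations of 18 and 16 months. The dictionary names below are illustrative only.

```python
# Mean variations in months, as reported in the text.
avg_group_mv = {"chronological": 10, "mental": 11}
population_mv = {"chronological": 18, "mental": 16}

# Ratio of within-group variation to variation in the whole population.
ratios = {kind: avg_group_mv[kind] / population_mv[kind] for kind in avg_group_mv}

for kind, value in ratios.items():
    print(kind, round(value, 4))
```

A smaller quotient means that the boys within a group are more alike on that measure, relative to the whole population of sixty-six, than a larger quotient would indicate.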


In some groups, the chronological ages are quite similar. This
is very marked in Group Two. In this group, all of the chronological
ages are in the twelfth year. The mean variation of the
mental ages is greater than that of the chronological ages in Groups
Four, Seven, Ten and Twelve.

In Group Seven, the chronological ages are very similar and the 
mean variation of the mental ages is greater as near as could be 
discovered. Two members of this group have no mental rating so 
the mean variation of the mental ages could not be determined in 
exact terms for all the members in the group. 

The chronological ages of the two boys in Group Eight are
nearer together than their mental ages. The chronological ages of 
the other boys in this group could not be found and they were 
not given a mental test, so they might alter this group in either 
direction. There is very little difference in the chronological ages 
in the boys of Pair Fourteen and in Pair Nineteen, also. 

This shows that chronological ages might have some weight in 
the grouping of boys, but since there are more groups in which 
the mean variation of the mental ages is less than the mean varia- 
tion of the chronological ages, we might conclude that there is a 
tendency for the mental ages to have more weight in the formation 
of boys’ groups than chronological age.

DISCUSSION 

In the Franklin Club the mean variation of the chronological 
ages is twice as great as that of the mental ages. In this group, 
mental age must have affected the grouping. Common experience 
enters into the grouping also. If one of the boys had lived until 
this time in a rural district he would not have enough experiences 
common to the others to fully appreciate the attitudes of the other 
boys altho their mental abilities are practically the same as his. 

Proximity influenced the grouping in most of the cases. It even 
had an influence in the case of the boys living on opposite sides of 
the city. It was a common experience which brought them together 
and it was their mental ages kept them together. If one of the 
boys had been mentally high and the other mentally low, their 
common experience of selling papers down town would scarcely 
have been noted. Common experience is the means of bringing the 
boys together and mental level is the determining factor which 
continues to make their experiences common. 

Experience, education, and social status help determine the 
choice of comrades among adults of the same level and their social 
standards may differ so that their tendencies might be in an 
almost opposite direction, but among children the selections are 
less controlled. Children have more freedom in choosing their 
companions because they have not lived long enough to be in- 
fluenced in making their choice. Social standards, experience and 
education have not affected them to so great an extent. 

The boys in these twenty groups were not known to associate 
with boys who rank as normal, altho they had the same opportun- 
ity to get the same experience. Physically, their experiences were 
common, or could have been, but mentally each type of mind in- 
terpreted the experience to suit its own level. 

In several groups, the oldest boy has the lowest I. Q. and the 
youngest boy has the highest I. Q.; that is, the I. Q. varies in- 
versely with the chronological age. This brings the mental ages 
to an equalized level. 


Boys of the same mental level can understand and appreciate 
the same things. They have reached the same degree of progress 
in the stage of mental development. Their minds are so equipped 
that they can appreciate the same situations. Consequently, each 
is at home with the other. No one is comfortable for a very long 
time if he is much inferior to those about him. He must have a 
feeling of his own importance; that he has a place of his own 
in their society, or he will begin to shrink mentally away from 
them. On the other hand, if he can play his part on an equal 
basis with his companions, he will have a feeling of satisfaction 
and will be content to remain in their company. 

A boy mentally high will not be satisfied with companions sev- 
eral degrees below himself. They cannot reach his ideas and he 
will grow tired of being tied down to their level, so he will stand 
aloof or seek others for companionships, except in some cases 
where he takes a paternalistic attitude, and assumes a sort of 
guardianship or big brother attitude. 


A special class teacher reported that her two lowest children 
who were about seven years of age, would not have anything to 
do with one another. No doubt they were too young mentally 
to be social. They were mentally below four years. The older 
and higher type of children in the class enjoyed leading these 
younger boys about and caring for them. They had the paternal- 
istic attitude and the younger children enjoyed the feeling of their 
protection. We enjoy being with leaders or individuals who are 
our superiors because we feel that we are being benefited by their 
presence. They cause us to live in an ideal state which we enjoy. 
The low type child enjoys the company of the older children because 
he likes to have them do things for him. He is taking a 
babyish attitude. 

This is not an attempt to solve the problem of group companionship 
in so brief a study of so few cases, but the investigation 
indicates that mental age is the greatest factor in the selection of 
one’s companions and in holding groups together. It indicates 
that a common mental level with similar experiences as a background 
is a much stronger factor in group formation than any 
other factors investigated. 

LEGIBILITY OF BODONI, BASKERVILLE ROMAN, 
AND CHELTENHAM TYPE FACES 
By Harold E. Burtt and Coryne Basch 
Ohio State University 





INTRODUCTION 

The importance of type legibility for advertising or general 
reading needs no comment. The outstanding variables condition- 
ing legibility are size, case and face. The present study is con- 
cerned primarily with the last of these and is confined to an 
intensive investigation of three common type faces. 

There are five methods (as far as the writers are aware), for 
measuring legibility: maximum distance at which type can be 
read, time taken to read a passage, number of letters read in a 
tachistoscope or minimum exposure at which they can be read, 
minimum illumination under which type can be seen and extent 
to which letters can be thrown out of focus and still be identified. 
The first two of these methods have been the ones most frequently 
used. The present study used the last of the five methods men- 
tioned. 

The most extensive study of type faces is that of Roethlein.¹ 
She presented a page of 28 letters at a distance beyond the 
limits of legibility and then advanced it toward the observer by 
20 cm. increments, recording at each stage the letters that could 
be identified. This method made it possible to accumulate a large 
amount of data in a reasonable time and a considerable number 
of different faces were studied. Cheltenham, however, was the 
only one which she used that was involved in the present study 
so no comparison is possible. 

Roethlein found that legibility depended on form, size and 
heaviness of the letters and on the margin, position in group 
and on the shape and size of adjacent letters. 

Scott² found that a longer time was taken to read a light-faced 
type than to read a heavy-faced type. 

Adams³ performed an experiment on speed of reading a series 
of advertisements set in a wide variety of type arrangements. 
The material was too complicated to throw much light on the 
specific problem of type face. 

The present study tried to abstract from all variables except 
type face. Individual letters were used, all of the same size. 


¹ Roethlein, B. W. The Relative Legibility of Different Faces of Printing 
Types. Amer. J. of Psych. 1912, 25, 1-36. 

² Cf. Hollingworth, H. L. Advertising and Selling. 1920, p. 7 

³ Adams, H. F. Advertising and its Mental Laws, 1921, p. 214. 

MATERIAL 

Three type faces were used in the experiment—Bodoni, Basker- 
ville Roman, and Cheltenham Medium. These were used in both 
upper and lower cases. All the letters were fourteen point so 
that the size variable was constant except in so far as the upper 
case letters often occupied greater area than the corresponding 
lower case of the same point. The letters were cut from sample 
booklets of the American Type Founders Company. 

Each letter was centered individually on white unglazed card- 
board 6x6 inches for exposure. It was impossible to get a com- 
plete alphabet in all the faces and cases. Consequently, the re- 
sults considered below were based on all the letters of the alphabet 
with the exception of J, K, L, Q, U, X, Y, Z. Several of these 
letters are rather infrequently used anyway. The remaining 
eighteen letters in the three faces and two cases make one hundred 
and eight letters involved in the investigation. Some other 
letters were included but their results were not considered. 

The material was presented in a Focal Variator.* This ap- 
paratus comprises an exposure field about 6 inches square illuminated 
by daylight lamps inside a hood. The material exposed is 
projected through a stationary convex lens which controls its size 
and then through two similar lenses which are arranged to move 
reciprocally so that the focal point of the image may be changed 
without altering the size. The material is projected through this 
differential system upon a stationary ground glass so that as the 
lenses move the image on the glass goes in and out of focus. The 
subject sitting with his forehead in a head-rest fifteen inches from 
the ground glass observes the images. In the present experiment 
a quarter-inch diaphragm was placed on the lens next the ground 
glass to lessen the marginal glare. This was desirable because of 
the small size of the letters used but did not affect the focus. 

The differential lens system was controlled by turning a small 
crank. This crank actuated a screw 16 threads to the inch so that 
for each revolution an indicator moved 1-16th of an inch on a 
linear scale. The crank was also attached to a small wheel with 
its margin graduated into 100ths so that it was possible to make 
readings to 1-100 of 1-16 of an inch. 
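
The arithmetic of the scale can be sketched as follows. The pitch of 16 threads to the inch and the 100 wheel graduations come from the description above; the function names are hypothetical, chosen only for this illustration.

```python
THREADS_PER_INCH = 16   # one crank revolution moves the indicator 1/16 inch
WHEEL_DIVISIONS = 100   # graduations on the margin of the crank wheel


def scale_reading(full_turns, wheel_divisions):
    """Scale reading in sixteenths of an inch (hypothetical helper)."""
    return full_turns + wheel_divisions / WHEEL_DIVISIONS


def travel_in_inches(reading):
    """Linear travel of the indicator, in inches, for a given scale reading."""
    return reading / THREADS_PER_INCH


# Fourteen full turns plus seventy wheel divisions is a reading of 14.7
# sixteenths of an inch.
reading = scale_reading(14, 70)
travel = travel_in_inches(reading)

# Smallest readable step: 1/100 of 1/16 inch, i.e. 1/1600 inch.
smallest_step = 1 / (THREADS_PER_INCH * WHEEL_DIVISIONS)
```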

The data will be presented in terms of these scale readings. 
However, it is possible to convert such readings into terms of the 
position at which the image is actually in focus. 

The apparatus was set during the experiment so that at zero 
scale reading the image was perfectly in focus on the ground glass. 
The calibration curve was obtained by moving the ground glass 
toward the lenses by increments of one inch and at each point 
determining the scale reading when the image was in perfect focus. 
These determinations were made by coming in both directions from 
illegibility until a letter showed no appreciable increase in clearness 
and continuing until it began to decrease in clearness. These 
four scale readings were averaged. This procedure was followed 
with five typical letters at each point and with two observers. 
Scale readings were plotted against distance of the focal point 
from the exposure field. The curve is practically a straight line 
whose equation by least squares is 2.09x - 0.64y + 26.85 = 0, where 
x represents scale reading and y represents distance of focal point 
from the exposure field. 

*Weiss, A. P. The Focal Variator. Journal of Experimental Psychology, 
1917, 2, 106-115. 
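
The calibration line can be solved for y to convert a scale reading into the position of the focal point. The sketch below simply rearranges the least-squares equation 2.09x - 0.64y + 26.85 = 0. The units are an assumption: x is taken in the scale units used for the data (sixteenths of an inch) and y in inches, since the ground glass was moved in one-inch steps, but the text does not state them explicitly.

```python
def focal_distance(scale_reading):
    """Distance y of the focal point from the exposure field.

    Rearranged from 2.09x - 0.64y + 26.85 = 0, so y = (2.09x + 26.85) / 0.64.
    Units are assumed, not stated in the source.
    """
    return (2.09 * scale_reading + 26.85) / 0.64


# At zero scale reading the image is in focus on the ground glass.
y_at_zero = focal_distance(0.0)

# A reading of 14.7 places the focal point correspondingly farther away.
y_at_reading = focal_distance(14.7)
```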


METHOD 

The experiment was conducted in a dark room. During an actual 
trial there was no illumination except the lights in the hood and 
a small heavily shaded light near the scale. Between trials there 
was practically no change in the illumination so that the subject 
was substantially adapted throughout to the ground glass field. 

Prior to each trial the lenses were turned to a point previously 
determined where all the letters were invisible. A letter was then 
placed on the field at the back of the hood. At the signal “ready,” 
the observer placed his forehead on the head rest and fixated the 
center of the ground glass. The experimenter then turned the 
crank at the rate of one revolution in four seconds, keeping time 
with a metronome. The observer named the letter as soon as he 
recognized it. If his response was correct the experimenter re- 
leased the handle immediately and recorded the reading. If the 
response was incorrect, the experimenter said “No” and continued 
turning the crank until the correct response was given. The in- 
correct response was, of course, subsequently recorded. 

The letters were given in a random order with all the cases and 
faces inter-mixed. Each subject worked for two sittings and in 
some cases three, in order to complete the entire series. The 
subjects were frequently asked if they were fatigued and there was 
considerable opportunity for rest between trials while the ex- 
perimenter was recording results and changing material. 

The subjects were seven students taking a laboratory course in 
psychology. None of them had had experimental training other 
than that which they had received in the course. There were five 
males and two females. Five of the seven had previously per- 
formed a simple experiment with a Focal Variator themselves. 
The vision of all was 20-20 or corrected to that by glasses. 


RESULTS 
The outstanding differences between the three families of letters 
are shown in Table I. The first column gives the observer, the 
second the average reading of that observer for the eighteen letters 
in Bodoni upper case. The next column, the readings for Bodoni 
lower case, and so on. For example, observer number one gave an 
average scale reading for the eighteen letters in upper case Bodoni 
of 14.7 sixteenths of an inch. His corresponding reading for Bodoni 
lower case was 11.4, for Baskerville upper case 14.6, and so on. 
The bottom row in the table gives the average for the seven observers. 
The same results are shown graphically in Figure 1. 


TABLE I 

AVERAGE LEGIBILITY OF 18 LETTERS 

              Bodoni        Baskerville     Cheltenham 
Observer   U. C.  L. C.    U. C.  L. C.    U. C.  L. C. 
I          14.7   11.4     14.6   12.8     17.6   13.1 
II         12.9   10.9     13.6   12.0     16.4   12.9 
III        12.9   10.9     16.5   10.9     17.9   13.4 
IV         12.2   10.7     13.1   10.7     14.4   12.0 
V          13.6   11.2     15.5   12.6     17.4   13.2 
VI         12.7   11.8     15.5   12.4     17.5   13.6 
VII        13.9   11.0     15.8   11.9     17.5   13.3 
Average    13.3   11.1     14.9   11.9     17.0   13.1 

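
As a check on the bottom row of Table I, the column averages can be recomputed from the seven observers' figures. The short sketch below does this; the dictionary layout and names are only illustrative, and the entry printed as "i3.9" for observer VII is read as 13.9.

```python
# Scale readings from Table I, one list per face and case, observers I-VII.
table_1 = {
    "Bodoni U.C.":      [14.7, 12.9, 12.9, 12.2, 13.6, 12.7, 13.9],
    "Bodoni L.C.":      [11.4, 10.9, 10.9, 10.7, 11.2, 11.8, 11.0],
    "Baskerville U.C.": [14.6, 13.6, 16.5, 13.1, 15.5, 15.5, 15.8],
    "Baskerville L.C.": [12.8, 12.0, 10.9, 10.7, 12.6, 12.4, 11.9],
    "Cheltenham U.C.":  [17.6, 16.4, 17.9, 14.4, 17.4, 17.5, 17.5],
    "Cheltenham L.C.":  [13.1, 12.9, 13.4, 12.0, 13.2, 13.6, 13.3],
}

# Mean of each column, rounded to one decimal as in the table.
averages = {
    column: round(sum(readings) / len(readings), 1)
    for column, readings in table_1.items()
}

# Faces ordered from most to least legible on the upper case averages.
upper_case_order = sorted(
    (c for c in averages if c.endswith("U.C.")),
    key=lambda c: averages[c],
    reverse=True,
)
```

The recomputed averages agree with the printed row, and the upper case ordering is Cheltenham, Baskerville, Bodoni.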

The abscissa of the curve represents the three faces indicated 
while the ordinate represents the scale readings. Each observer is 
indicated by a different type of curve with the heavy curve giving 
the averages. The nine curves nearest the bottom of the chart 
indicate lower case and the others upper case. 

The general tendency of results is obvious. Cheltenham is more 
legible than Baskerville. Baskerville in turn is more legible than 
Bodoni. The results are more marked with the upper case letters. 
It may be noted incidentally that throughout the upper case is 
more legible than the corresponding lower case. It should be 
borne in mind, however, that we are here dealing with individual 
letters whereas it is well known that entire words printed in lower 
case are more legible than the same words printed in upper case. 

The results are presented from the standpoint of probable error 
in Table II. 

TABLE II 

RATIO OF DIFFERENCE TO PROBABLE ERROR OF DIFFERENCE 

             Bodoni vs.      Bodoni vs.      Baskerville vs. 
             Baskerville     Cheltenham      Cheltenham 
Observer     U. C.  L. C.    U. C.  L. C.    U. C.  L. C. 
I            0.2    2.2      4.8    2.8      5.8    0.4 
II           1.4    2.2      6.1    3.9      4.1    1.7 
III          5.2    0        8.6    3.3      1.9    3.2 
IV           2.1    0.2      4.8    3.4      2.8    3.6 
V            3.4    2.7      7.6    4.1      3.2    1.3 
VI           5.0    1.3      8.1    3.0      3.8    2.3 
VII          4.8    1.9      8.3    5.8      4.3    2.8 
Average      3.2    1.5      6.9    3.8      3.7    2.2 
All cases    7.2    3.0     16.9    9.5      8.8    4.4 

This table gives the ratio of difference to probable error of dif- 
ference for the various comparisons and subjects indicated. For 
example, with subject I the difference between the average of 
Bodoni upper case and the average of Baskerville upper case is 
0.2 times the probable error of difference, while for those two faces in 
lower case the difference is 2.2 times the probable error. The figures 
for the seven subjects in each column are averaged in the row 
marked “average.” The probable errors were also computed for 
all the letters of a given face and case for all the observers, i. e., 
126 (18x7) readings. The differences between the averages at 
the bottom of Table I may be thus evaluated and in Table II the 
row “all cases,” gives the ratios of these differences to the prob- 
able errors of difference. 
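
The report does not give the formula behind these ratios. A minimal sketch, assuming the conventional definitions of the period (probable error of a mean = 0.6745 times its standard error, and probable error of a difference = the square root of the sum of the squared probable errors), might look as follows. The readings are hypothetical and serve only to show the calculation.

```python
import math

PE_FACTOR = 0.6745  # probable error as a fraction of the standard error


def probable_error_of_mean(readings):
    """0.6745 * sigma / sqrt(n), with sigma taken about the sample mean."""
    n = len(readings)
    mean = sum(readings) / n
    variance = sum((r - mean) ** 2 for r in readings) / n
    return PE_FACTOR * math.sqrt(variance / n)


def ratio_to_probable_error(readings_a, readings_b):
    """Difference of the two means divided by the P.E. of that difference."""
    mean_a = sum(readings_a) / len(readings_a)
    mean_b = sum(readings_b) / len(readings_b)
    pe_difference = math.hypot(
        probable_error_of_mean(readings_a),
        probable_error_of_mean(readings_b),
    )
    return abs(mean_a - mean_b) / pe_difference


# Hypothetical scale readings for one observer on two faces.
face_a = [12.0, 14.0, 16.0, 14.0]
face_b = [17.0, 19.0, 21.0, 19.0]
ratio = ratio_to_probable_error(face_a, face_b)
```

Under these assumptions a ratio of about 4 or more would conventionally be read as a reliable difference, which is in keeping with how the table is interpreted in the text.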


The table indicates that the difference between Bodoni and 
Cheltenham upper case is unquestionably significant, as the 
smallest individual subject’s difference is 4.8 P. E. The differences 
between Baskerville and Cheltenham are not as marked but 
some of the individual differences are significant and the average 
difference undoubtedly so. The differences between Bodoni and 
Baskerville upper case are of about the same order although not 
quite as large. With the lower case letters the relations are much 
the same as with the upper case but all the differences are some- 
what smaller from the standpoint of probable error. The differ- 
ence between Bodoni and Cheltenham is the most marked, that 
between Baskerville and Cheltenham somewhat less, while that 
between Bodoni and Baskerville lower case is of somewhat doubt- 
ful significance. 

These results seem plausible from analysis of the characteristics 
of the three families. Cheltenham involves relatively heavy strokes 
of fairly uniform width. Bodoni, on the other hand, involves many 
light strokes and this is especially marked with upper case letters. 
Baskerville likewise involves light strokes though not as light as 
Bodoni. This corresponds to its intermediate position in legibility. 

In addition to the differences between the families established on 
the basis of the average of 18 letters, there may be points of inter- 
est in considering individual letters in the various faces. Table III 
presents data from this standpoint. Each entry in the table rep- 
resents the average of the seven subjects for the letter and face 
indicated. 


TABLE III 

              Bodoni        Baskerville     Cheltenham 
Letter     U. C.  L. C.    U. C.  L. C.    U. C.  L. C. 
A          13.1    9.5     13.9    9.9     16.9   12.4 
B          12.7   13.0     14.1   15.2     15.1   14.9 
C          14.1   10.7     15.8   11.4     18.3   13.2 
D          14.0   12.2     16.3   12.5     18.2   14.5 
E          12.7    9.4     13.1   10.5     16.2   12.2 
F          11.2   11.6     14.2   12.1     15.4   10.6 
G          13.7   10.9     16.5    9.9     18.2   10.1 
H          14.2   12.6     15.6   13.3     17.1   14.7 
I          12.8   10.0     12.2   13.5     13.1   13.5 
M          13.7   13.7     16.9   11.9     17.2   15.8 
N          11.8    9.7     15.9   11.0     17.0   11.9 
O          15.8   12.5     17.0   12.7     20.2   14.5 
P          14.4   13.5     14.8   14.1     17.8   15.4 
R          13.0   11.1     14.1   11.5     16.5   10.5 
S          13.5    8.6     12.9   10.3     15.3   11.6 
T          11.7    9.1     14.1    9.5     16.5   11.0 
V          12.2   10.9     14.0   11.9     18.1   13.2 
W          13.3   11.4     17.1   13.4     17.9   14.8 
Av.        13.2   11.1     14.9   11.9     16.9   13.0 


Obviously the legibility differences vary with the letters. The 
safest way to consider the matter is in terms of probable error. 
This was computed for any differences between means of the same 
letter in different faces that seemed of possible significance. 

The differences between lower case Bodoni and Cheltenham are 
most pronounced with the following letters and the differences are 
all greater than 4.5 times the probable error: S, I, V, C, A, E, W. 
Most of these letters involve a considerable portion of light strokes 
which in the Bodoni are of course very light. With lower case 
Baskerville and Cheltenham the ratio is 7.2 for A and 4.4 for C, 
with all the others less than this. With Bodoni and Baskerville 
lower case the ratio for I is 8.9 but the next smaller is 3.7 for P. 
The most noticeable fact regarding the lower case letters is perhaps 
the inferiority of Bodoni or Baskerville A to Cheltenham A. 
This letter has three light horizontal strokes. 

With the upper case the differences are relatively greater. For 
Bodoni vs. Cheltenham they are significant for practically all 
letters, but most marked in the case of V, N, B and O. With 
Baskerville and Cheltenham, V, E, P, S and R, all have differences 
greater than 4.5 P. E. With Bodoni and Baskerville the ratios for 
N, G and B are greater than 4.5. One noticeable thing is the 
poor showing of Bodoni N. It comprises two very light vertical 
strokes. 

Another method of approach is to consider each letter’s legi- 
bility relative to the others of the same face and case. This 
could be accomplished by ranking the values in each column of 
Table III. The greatest discrepancies between an upper case 
letter’s rank in different faces are as follows: N has relatively 
low legibility in Bodoni compared with Baskerville. This fact 
has been noted above. S, however, ranks much higher in Bodoni 
than in Baskerville. V stands high in Cheltenham but low in the 
other two. With lower case letters the following discrepancies are 
large. Baskerville I is relatively high in legibility and Bodoni I 
relatively low. Bodoni F is relatively high and Cheltenham F 
relatively low. Cheltenham and Bodoni M are the most legible 
letters in their respective faces, while Baskerville M is only average. 

In general there is a somewhat closer correspondence in the 
relative legibility of the letters in the three faces with lower 
case than with upper case. Correlations were computed (rank 
differences squared) between the different columns of Table III. 
For the upper case the correlations are: Bodoni-Baskerville .47, 
Bodoni-Cheltenham .74, Baskerville-Cheltenham .63. For the lower 
case the corresponding figures are .73, .70, and .73, indicating 
closer correspondence in the case of the lower case letters. It 
would seem that upper case letters afford a better opportunity 
for face differences in legibility to manifest themselves. 
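
The "rank differences squared" method mentioned above is, presumably, the familiar rank-difference formula rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)). A sketch of that calculation on two columns of Table III follows. The treatment of tied values (shared mean ranks here) is an assumption, so the result need not reproduce the published coefficients exactly.

```python
def average_ranks(values):
    """Rank 1 = most legible; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def rank_difference_correlation(x, y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2
                    for a, b in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Upper case columns of Table III, letters A through W in table order.
bodoni_uc = [13.1, 12.7, 14.1, 14.0, 12.7, 11.2, 13.7, 14.2, 12.8,
             13.7, 11.8, 15.8, 14.4, 13.0, 13.5, 11.7, 12.2, 13.3]
cheltenham_uc = [16.9, 15.1, 18.3, 18.2, 16.2, 15.4, 18.2, 17.1, 13.1,
                 17.2, 17.0, 20.2, 17.8, 16.5, 15.3, 16.5, 18.1, 17.9]

rho = rank_difference_correlation(bodoni_uc, cheltenham_uc)
```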

A record was kept of all cases in which a letter was mistaken 
for another. These were tabulated to show any common ten- 
dencies. No error was considered of interest unless made by at 
least two subjects. The following confusions occurred in all three 
faces: lower case E read C; lower case H read B; upper case I 
read L; upper case R read K. The following confusions occurred 
in two faces, lower case: F read I; T read I; and the following in 
two faces, upper case: B read C; M read W; P read F. There 
was nothing in the analysis of results from this standpoint to 
clearly indicate differences in face. 

SUMMARY 

A study was made of the comparative legibility of Bodoni, Baskerville 
Roman and Cheltenham Medium type faces. The legibility 
of individual letters was determined separately using upper and 
lower case and fourteen point. The letters were presented in 
a focal variator and evaluated in terms of the distance they could 
be thrown out of focus (maintaining constant size) and still be 
recognized. 

Greatest legibility was found on the average for Cheltenham, 
followed by Baskerville, with Bodoni the least legible. This was 
true for both upper case and lower case but the differences were 
more striking from the standpoint of probable error with the 
upper case. These differences seem plausible upon examination 
of the three faces, particularly with reference to the width of 
strokes. Study of individual letters showed a tendency for the 
greatest differences between faces to occur with letters involving 
light strokes, inasmuch as these are particularly light in the 
Bodoni and Baskerville. 








A COMBINATION MENTAL TEST FOR CLINICAL 
USE 


By GRACE H. KENT 
Worcester State Hospital, Worcester, Mass. 


I. LIMITATIONS OF STANDARD TESTS, FOR DIAGNOSTIC 
PURPOSES. 


The serviceability of mental tests for tentative classification of 
large groups of persons has been established beyond reasonable 
doubt and there are many tests in common use, both individual 
and group, which are sufficiently accurate for such purposes as 
classification of army recruits, school children, or factory em- 
ployees. But in the clinic no mechanically applied test should be 
expected to take the place of patient and painstaking individual 
study. Used as an instrument for such study the standard test 
may be a valuable aid; but it is of questionable value if we accept 
the findings as final and thus make it serve as a substitute for 
careful observation. 

The concept “mental age” is an exceedingly convenient one and 
it is not unnatural that a servant so obliging should be expected 
to work over-time. Because the mental age can be so easily stated, 
it is almost inevitable that its significance should be exaggerated. 
Thus we find a tendency to speak of a child’s mental age as if it 
were as definitely and finally determinable as his life age and there 
is danger that the qualitative findings of the test—which may be 
vastly more significant than any results that can be stated in fig- 
ures—will be wholly overlooked. 

An unavoidable source of error in the use of a norm is that it is 
impossible to give due recognition to the variations in the indi- 
vidual records from which the norm is constructed. The norm may 
be derived from a sufficiently large number of representative per- 
sons, but the individual variations still remain and the limits of 
possible normality lie close to the extremes rather than near the 
average or median.¹ 





¹ I plead guilty to the failure to give due weight to this factor. In the 
interpretation of the data which formed the basis of the Kent-Rosanoff 
association test, published in 1910, I was so deeply impressed by the differences 
which the test indicated between normal and insane subjects that I 
did not fully appreciate the importance of the similarities thus indicated. 
The average number of “individual reactions” for normal subjects was seven 
per cent of all reactions, and the average for insane subjects was twenty-seven 
per cent. This is obviously a significant difference, but we cannot 
determine what it signifies for a given individual. The range of variation 
was very wide for both classes of subjects, being from zero to ninety per 
cent for the normal subjects and from five to ninety-seven per cent for the 
insane subjects. There was thus an overlapping amounting to eighty-five 
per cent between individuals of the two classes, and it is plain that no 
conclusions should be drawn about any individual on the basis of a test 
which yields such variable results. 
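
The eighty-five per cent figure in this note is simply the width of the interval shared by the two ranges. A small sketch, with hypothetical names, makes the arithmetic explicit:

```python
def range_overlap(range_a, range_b):
    """Width of the interval common to two (low, high) ranges, or 0 if none."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return max(0, high - low)


# Per cent of "individual reactions", as reported in the note.
normal_range = (0, 90)
insane_range = (5, 97)

overlap = range_overlap(normal_range, insane_range)
```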

Another source of error which should be kept in mind is the 
personal equation of the examiner. This factor is well eliminated 
in many group tests by reducing the administration of the tests and 
the scoring of results to a purely mechanical process. But it cannot 
be so easily overcome in individual tests because much depends 
upon the inflection with which the examiner reads a question, and 
upon the spirit and interest with which he presents a problem. 
The test may be so fatiguing to the examiner that he is unable in 
the latter part of the working day to do full justice either to the 
test or the subject. And in spite of the scoring cards and minute 
instructions which add more to the difficulty than to the accuracy 
of scoring, there remains considerable room for difference of opin- 
ion, especially in tests calling for involved verbal responses. 


In order to be valid for adult subjects a test should be based in 
part upon faculties that develop with increasing maturity and 
experience as opposed to those faculties that reach their zenith 
during the high school period and fall into disuse after school life 
is completed. If we are to depend largely upon tests for diagnosis, 
we should have a test that would show in what way a deficient man 
having a “mental age” of nine years differs from a normal boy of 
nine. We know that the man lacks the developmental possibilities 
which constitute the boy’s chief asset to society. We know also 
that the man, by reason of his longer experience and more settled 
habits, can do many things which the boy cannot do. We indicate 
the man’s inferiority to the boy by assigning an intelligence quo- 
tient of fifty-six; but our criteria for determining his intellectual 
status are not adequate unless we can show also in what ways he 
is superior to the boy. Determination of the mental ability which 
is measured by the Binet scale, call it what we please, does not tell 
us what we need most to know about a clinical subject. The man of 
thirty who gives a nine-year mental rating may be a useful and 
trustworthy laborer, or he may be a serious menace to society. 
Two thirty-year men of nine-year mentality, equally free from de- 
linquent tendencies, may differ very widely in their capacity for 
work, in their adaptability to different kinds of work, and in the 
amount of supervision required. The fact that they give the same 
mental rating is relatively unimportant since the proper disposal 
of their cases depends upon distinctions which are not shown by 
the test. 
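
The quotient of fifty-six quoted above follows from the ordinary ratio of mental age to chronological age, provided the chronological age of an adult is capped. The sketch below assumes a cap of sixteen years, the convention of the Stanford Revision; the text itself does not state the divisor, and the function name is hypothetical.

```python
ADULT_DIVISOR_YEARS = 16  # assumed cap on chronological age for adults


def intelligence_quotient(mental_age, chronological_age):
    """100 * mental age / chronological age, with the adult age capped."""
    effective_age = min(chronological_age, ADULT_DIVISOR_YEARS)
    return round(100 * mental_age / effective_age)


# The deficient man of thirty and the normal boy of nine share a mental
# age of nine years but receive very different quotients.
iq_man = intelligence_quotient(9, 30)
iq_boy = intelligence_quotient(9, 9)
```

The single number separates the man from the boy, but, as the text argues, it says nothing of the ways in which the man's experience makes him the more capable of the two.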


It is difficult to over-state the possible errors involved in the use 
of a verbal test for subjects who do not understand the language 
in which the test is given. It is true, under certain conditions, that 
a child’s ability to acquire language may furnish a fair index to his 
intelligence, but we cannot make this assumption in a given case 
unless we know that the subject has had adequate opportunity to 
acquire the language. I recently tested many children in a mill 
village of which the foreign population is in the majority. The 
children use their own language at play, even on the school grounds, 
and many of them have rarely heard a word of English outside of 
the school room. To assign a mental age based on such showing as 
they can make in a verbal test would seem little short of perjury. 
Performance tests are useful in such cases, but more for qualita- 
tive interpretation than for mental rating. 

A test stands on its own merits, regardless of norms, since the 
findings obtained by an unstandardized test that is well suited to 
the subject are more significant than a mental rating derived from 
an ill-adapted test. Dr. Walter E. Fernald’s questions on “Prac- 
tical knowledge” possess certain advantages over any standard- 
ized test known to me. Although they cover a wide range of 
difficulty, these questions can be asked casually and answered 
simply, usually in a very few words. They relate largely to mat- 
ters of local interest, and the child may easily assume that the 
examiner is seeking information on the matter inquired about. 
There is no definite programme for asking the questions, as for 
administering a more formal test, so the examiner is free to omit 
any question which does not fit the case and which might make the 
subject feel strained and unnatural. I should like to see these 
questions standardized, but not unless it could be done without de- 
stroying the elasticity of the system. For although a norm is a 
great convenience to the examiner, it is not essential to the serv- 
iceability of the test. The age value represented by a given question 
or task is empirical, not absolute, and the examiner’s own experi- 
ence ought not to be wholly at a discount. When a test—even a 
standard test—is used by an experienced observer, I question the 
advantage of rigid adherence to rules. 

No test can be considered valid except on the assumption that 
the subject answers each question as well as he can. But it fre- 
quently happens that a subject who is brought to the clinic for 
examination comes very unwillingly and with no intention of giving 
any information about himself. The tactful examiner can usually 
obtain some measure of responsiveness, but it is impossible to state 
the exact degree of co-operation received. The lack of co-operation 
cannot be estimated in quantitative terms, while its effect on the 
mental rating is invariably quantitative. 


Among insane subjects one meets with all possible degrees of 
co-operation. Some patients put forth their maximal effort in the 
hope of proving their fitness for life in society, some offer violent 
resistance, and some give no response at all. There are many cases 
in which the qualitative findings are of considerable interest, but 
it is exceptional to find a patient for whom the mental rating, as 
such, has any significance even as a matter of record, still less for 
diagnosis or treatment. When the rating is high enough to ex- 
clude mental deficiency, this diagnosis is almost certain to be 
excluded also on other grounds. A low rating may indicate defi- 
ciency, deterioration, or poor co-operation. Extremely wide scat- 
tering of success and failure is usually associated with a psychotic 
condition, but it may indicate merely that some of the problems 
presented are interesting enough to command the co-operation of 
the subject while others are beneath his notice. 

The examiner who administers the test usually learns something 
about the case but not necessarily anything that can be stated sta- 
tistically. In many cases it would be grossly misleading to assign 
a mental age. 

II. INHERENT WEAKNESS OF THE STANFORD 
REVISION, FOR CLINICAL USE. 

No single system of tests is equally well adapted to all classes 
of subjects. The Binet scale is especially adapted to children of 
normal mentality, and in this field I can agree with Terman that
“In the large majority of cases persistent silence deserves to be 
scored failure.” School children are in general the most responsive 
subjects I have ever had occasion to test, not excepting psycholog- 
ically trained university students, and the matter of obtaining their 
co-operation is not a serious problem. 

But this does not hold for clinical subjects, even when they are 
tested in school. The mental measurement of defectives requires 
a much finer technique than is necessary for the classification of 
normal children, and at best there is serious danger of rating a 
child too low. The younger children who are brought to the school 
clinic are apt to be timid in the presence of a stranger, and the 
older ones are usually suspicious and sensitive. Defective speech 
is very common, frequently serious enough to make the child shy 
and self-conscious, and sometimes so marked that his answers are 
unintelligible. Deficient children are very susceptible to fatigue, 
and the examination is usually too long, except for a child rating
very low, to be made properly at one sitting.

When the external conditions are favorable and when there is 
no lack of time, I find the lower end of the Stanford scale fairly
satisfactory for testing English-speaking children who are free 
from speech defect. If a child is too young to realize that he is 
being tested for mental deficiency, it is merely a question of time 
to overcome his diffidence and to gain his confidence. He regards 
the test as a school exercise, and responds to the examiner as well 
as to a new teacher. 


Many of the difficulties encountered in the clinical use of the 
scale are clearly recognized by Terman and cited as exceptional
situations which should be met by some special mode of procedure.*
But in the clinic these difficult situations are the rule rather than 
the exception, and the means suggested for overcoming them are 
usually impracticable. Under ordinary clinical conditions, as when 
the children are brought from a distance or when they are exam- 
ined in school by a visiting psychologist, it may be impossible to 
take the child out for a walk in order to establish a friendly feel- 
ing, to allow two sittings for an examination, or to resort to any 
of the measures prompted by common sense. A full “Out-Patient 
Day” is not conducive to accurate work. The examination is us- 
ually too hurried to be satisfactory, and after four or five equally 
unsatisfactory tests performed in quick succession the examiner 
may be too tired to take proper interest in subsequent tests. 


My principal objection to the lower half of the scale when used 
for juvenile subjects is that it requires conditions more perfect than 
usually obtain in the clinic—more perfect than I have ever seen 
in any clinic. But for any subject able to go above the eight-year 
level and for an adult subject of any grade of mentality, the de- 
fects of the scale are more serious and more fundamental. 


Perhaps the most delicate situation which we meet in the psy- 
chological clinic is that presented by the examination of a self- 
supporting young man or woman of border-zone intelligence, who 
requires supervisory care because of some problem relating to con- 
duct. The subject may be painfully aware of his intellectual short- 
comings, and the humiliation of being tested for mental deficiency 
is almost more than he can bear. If the test is to be valid, we should 
be able to offer something sufficiently interesting to draw his atten- 
tion away from himself as a subject-who-is-suspected-of-being- 
feebleminded. But the tasks of the Stanford scale are strongly 
conducive to self-consciousness, especially the questions relating to 
school life. It is little less than inhuman to require a serious 
answer to the question “What should you do if in danger of being 
late to school?” 


The vocabulary test furnishes a good illustration of a test that 
is well adapted to school children but very poorly adapted to the 
clinic. A bright boy may regard it as a joke when his limit is 
reached, and I have never seen a normal child manifest serious 
annoyance over the series of failures with which the test is invari- 
ably brought to a close. But these failures are disheartening to 
the sensitive clinical subject, and the situation becomes more tense 
with each succeeding question. The test is, therefore, unduly 
fatiguing both to subject and examiner. 





*The Measurement of Intelligence, pages 124 to 135.

It is not necessary to consider each question individually.* The
fundamental weakness of the system is that too little account is
taken of the affective reaction of the subject. Terman assumes, as
did Binet, that the questions can be presented in such a way as to 
arouse the interest of the subject. But even if so, it is a serious 
defect of the system, for clinical purposes, that so much depends 
upon the presentation. As most of the subject matter is of no 
intrinsic interest to subnormal adults and adolescents, the resources 
of the examiner are severely taxed in the effort to make the test 
attractive enough to hold their attention. And the rules for the 
administration of the test, when they are followed literally, make 
the examination unduly lengthy and tiresome. The examiner is 
expected to cover one year wholly within the range of the subject’s 
ability and one year wholly beyond his range, thus including at 
least twelve questions that are frankly nondiscriminative for him, 
besides many more that are probably so. It is these nondiscrimi- 
native questions that are most disturbing to the subject’s emotional 
equilibrium. He is annoyed and discouraged by those that he can- 
not answer and frequently feels insulted by those which are so 
easy as to require no thought. 

It is of no small importance, in the interests of accuracy, that 
a clinical test should be easily and comfortably administered. At 
best a day’s work in the clinic is usually too exhausting to be 
done as it ought to be done. One who is engaged in collecting 
material for a new test of his own devising may well think noth- 
ing of working eight or even ten hours at high tension, because 
all thought of fatigue is overshadowed by the exhilaration of
watching the returns come in. But it is very different to spend 
a whole day making routine use of that same test after it has 
been published, and still more different to spend a day making 
routine application of a test for which one has no personal af- 
fection. Much of the routine testing in clinics is done by persons 
of comparatively little training, who have no vital interest in the 
results and who are not qualified to make constructive use of 
their observations. And if their work is made unreasonably dif- 
ficult, it is conceivable that some of them will lighten the burden 
by taking unwarrantable liberties with the technique. 

III. SUGGESTED CRITERIA FOR CLINICAL TESTS.

It is much easier to point out defects in the tests we are using 
than to devise methods for correcting them. Most of my recom- 
mendations are negative, but it can do no harm to call attention to 
the particular needs of the clinic. 

*The inappropriateness of the Stanford Revision subject matter, as applied
to adult subjects, has been discussed by Wells and Kelly, Intelligence and
Psychosis, American Journal of Insanity, July, 1920.

1. The task or problem should possess sufficient intrinsic interest
so that the subject will usually regard successful performance
as an end in itself. Spontaneous co-operation on the part
of the majority of subjects should be considered essential. It is 
not to be expected that any test will meet with universal approval 
among clinical subjects, but there should be no tasks which are 
commonly performed only in response to urging. 


2. Any test used for examination of adult subjects of what- 
ever grade of mentality should possess sufficient dignity so that 
it can be offered to a mature, self-respecting man without insult- 
ing him. Questions should relate to politics, travel, the industrial 
world and other subjects of universal interest rather than to school 
life. All material that is essentially puerile should be rigidly ex- 
cluded. 

3. The requirements for each task should be simple and easily
understood so that it may not be necessary to tax the attention
of the subject by a long and involved explanation. Repetition 
should not be necessary and the time required for presentation 
should be negligible. 

4. Questions which call for oral response, if used at all, should 
be satisfied by very simple answers. 

5. The administration of the test should not be so exhausting 
as to make it impossible for an examiner in ordinary health to 
put in a full day’s work of evenly good quality. Progressive de- 
terioration in the quality of presentation should be avoided by 
the use of tests that can be presented with the minimal amount of 
actual labor. There should be very few tests, if any, which re- 
quire the reading aloud of long paragraphs, and very few per- 
formance tests which call for close observation of the subject’s 
movements. No tests should require hair-splitting distinctions as 
to what constitutes an erroneous movement. 

6. Time limits are necessary in order to prevent an examination 
from being too long and tedious. But the time allowance in most 
tests should be liberal enough so that the subject will not be con- 
scious of working under a time limit. Many normal persons are 
unable to do themselves full justice when working under high 
pressure, and it is unfair to class a person as deficient primarily 
on the ground that he becomes disconcerted when hurried. 

7. Tests which call for reading knowledge on the part of the 
subject have a relatively small place in the clinic. Adult subjects 
who can read a little are usually very sensitive about their in- 
ability to read fluently, and the exposure of the deficiency in read- 
ing knowledge may be so disturbing emotionally as to invalidate 
the rest of the examination. Also, there are many clinical sub- 
jects who cannot read at all. 

8. Tests which elicit obvious failures should be avoided as far
as possible. Such errors as are necessary to make a test dis-
criminative should be interspersed among successful performances,
so that the subject will not be discouraged by the sense of failure. 
Tests in which the subject will be wholly unaware of his errors 
are especially valuable. (Healy Pictorial Completion II furnishes 
proof that this is not impossible. A subject may make ten er- 
roneous placings which net a minus score, and yet be perfectly 
satisfied with the result). 

9. A scale of tests intended for clinical use should be of rather
loose construction, so that thoroughness of the examination may 
not be dependent upon any particular test or group of tests. The 
system of scoring should permit giving the subject a rating on 
tests that are well suited to his command of language and his 
ability in general. It should not be necessary to give him a test 
that is hopelessly beyond him merely to prove that he cannot pass; 
nor, on the other hand, to give a test that he will take as an insult 
to his intelligence merely to prove that he can pass. The general 
plan of standardization for performance tests in which success or 
failure is obvious to the subject should be such that the examiner 
may permit or assist a subject to finish a task after the time limit 
is passed, so that the subject need not know of his failure to pass 
the test. And the rules for administration should be liberal enough 
to permit the examiner to select tests in accordance with the in- 
terests and ability of the subject. 

Other criteria, of course, are essential. But it is not necessary 
to enumerate here the various standards to which educational as 
well as clinical tests are expected to conform. 

The development of a scale which shall satisfy these require- 
ments is a problem for collective rather than individual effort. 
However, individual contributions are not wholly at a discount, 
and at worst the problem is difficult enough to be intensely inter- 
esting. 

IV. A TENTATIVE COMBINATION SCALE 

This is offered somewhat prematurely, as a preliminary study, in 
the hope that it will encourage the splitting of composite educa- 
tional tests into their component parts and the publication of inde- 
pendent norms for the separate units. 

This plan for using a miscellaneous group of tests as a sub- 
stitute for a standard scale is essentially the method suggested 
by Pintner and Paterson* for deriving a mental rating from a
group of tests that were standardized more or less independently. 

*A Scale of Performance Tests, chapter VI.

It seems to me that we can make a closer approach to the true
mental age, if there be any such thing, by basing the estimate
upon the median of a number of independent mental ratings than
by accepting as final the rating derived from any single system
of tests. I believe further that it is an advantage, other things
being equal, to use tests drawn from different sources, developed
by different authors, and standardized by results obtained from
different sets of children. In this way the total number of normal
subjects contributing to the norms is greatly increased, and certain
constant errors, such as the personal bias of the experimenter,
tend to be counteracted.

It is only as a tentative measure that I am willing to accept the 
term “mental age” as the end result of the series of tests. I con- 
sider the term absurd as applied to adults, and believe it to be 
frequently misleading as applied to children. It is far too readily 
understood, and is therefore used loosely by persons without psy- 
chological training. Worst of all there is a strong tendency to 
pass it around carelessly among workers in the clinic, sometimes 
within the hearing of a subject who is capable of comprehending 
its meaning. But although I strongly disapprove the use of this 
term and greatly prefer the percentile method, I find it necessary 
to express results of different tests in a term that will serve as a 
common denominator, and “mental age” is unfortunately the most 
convenient term for this purpose. My present aim, therefore, is to 
collect a large number of short-time tests which yield independent 
mental age ratings. 

An abbreviated form of the Stanford scale—usually two or oc- 
casionally three items for each year—furnishes one unit of the 
series. Healy Pictorial Completion II is an exceptionally valuable 
test for English-speaking subjects of high grade. Of the fifteen 
tests of the Pintner-Paterson performance scale, I find the fol- 
lowing especially useful: Mare and Foal Picture Board, Seguin 
Form Board, Five Figure Board, Two Figure Board, Casuist Form 
Board, Triangle Test, Diagonal Test, Ship Test and Substitution 
Test. Other tests which I am using already or expect soon to add 
to my collection are as follows: Kohs Block Design Test, Holley 
Vocabulary Test, Ferguson Form Boards, Cube Construction Test, 
Dearborn Form Board, Pressey Primary, Pressey Intermediate, 
Otis Intermediate, and selected tests from the Herring Revision of 
the Binet-Simon Scale.* 


The essential feature of the programme is that each task pre- 
sented to a subject can be selected with special reference to the 
interests and ability of this particular subject. Any test which 
does not elicit good co-operation is to be dropped, and the record, 
even if complete, is not to be counted in the series. 





*I wish to express my gratitude to Dr. J. P. Herring, who has supplied
me with records from which I can compute age norms for any tests which
I wish to use.

The tests used in any examination should be as varied as the
subject’s ability and interests permit, and it is important to have 
a larger collection from which to draw, especially of verbal tests. 
If it were necessary to choose between language tests and perform- 
ance tests I should give the preference to performance tests, be- 
cause nearly all clinical subjects co-operate more readily in these 
tests than in tests requiring an oral response, and because there 
are many subjects who cannot be reached at all by language tests. 
But performance tests are limited in their scope, and many of them 
are highly susceptible to the influence of chance. They do not take 
the place of language tests for subjects who have fair command
of the language, and the frame-work of the method will not be
complete until verbal tests are available in much greater variety. 
The verbal-performance tests which are now being used extensively 
in schools offer a wealth of material for this purpose. The Ter- 
man Group Test, for example, contains several units which would 
be exceedingly useful in the clinic, given as individual tests. It 
will be necessary, of course, to compute norms for the separate 
units before they can be used in this scheme. 

It is my plan to use in approximately equal proportions tests 
which depend upon accuracy and tests which are scored by speed. 
While it is of course impossible to distinguish between these two 
variables in a strict sense, I think we are justified in disregarding 
one or the other in practice. In those performance tests in which 
the performance leaves no record of the errors, the difficulty of 
counting the subject’s moves introduces a source of inaccuracy 
which it is particularly desirable to eliminate, since there is no 
way of checking up the examiner’s count. In the Pintner-Paterson 
tests which I am using most freely, the error curve runs almost 
parallel to the time curve. It is true that the time record does not 
tell the whole story, but I cannot see that the count of moves or 
errors adds anything to it, and I think the examiner’s time can 
be spent more profitably taking notes for qualitative study of the 
subject’s mode of procedure. In giving the Kohs test recently to 
adult subjects of very superior ability, I observed that some toyed 
aimlessly with the blocks, thus making many useless moves; some 
worked by trial and error, moving so rapidly that I was unable 
to count the moves with even an approach to accuracy; and some 
worked very deliberately, with almost no false movements. All 
gave high scores, and the number of moves depended, apparently, 
more upon temperament than upon the ability measured by the 
test. 


*I am indebted to Dr. Luella Cole Pressey for providing me with age
norms for the four parts of the Pressey Primary test. This furnishes four
excellent verbal-performance tests for subjects who understand English
but who cannot read.

The mental ratings obtained from a given subject by different
tests usually vary quite widely, sometimes ranging from sixteen 
years to zero. This may be due in part to the influence of chance 
upon some of the mechanical performance tests. But as the rat- 
ings which are affected by chance tend naturally to approach the 
extremes of the series, they do not invalidate the final result. In a
series of fifteen tests it is not uncommon to find the middle four 
or five figures almost identical. If the ratings are fairly uniform 
I consider a series of ten tests sufficient, and I have occasionally 
assigned a mental age on the basis of nine tests. Ordinarily I 
prefer to make at least twelve tests. 

The mental age as finally recorded is the median mental rating 
of all the tests—never less than nine—which are clearly discrimi- 
native for the subject, exclusive of any tests in which he does not 
give good co-operation and any tests which are too easy or too 
difficult to measure his ability. In the Manikin test, for instance, 
the highest possible score has a mental age value of eight years. 
If this record were counted in the series of a subject rating over 
eight years, the median would be lowered one point, and it would 
be absurd thus to penalize a perfect record. A test so difficult as 
to yield a zero score would have exactly the same effect upon the 
median, and it would be possible in testing a subject of low grade 
mentality, to give enough too-difficult tests to bring the median 
down to zero. A zero score does not necessarily prove the test to
be non-discriminative, but I count the test too difficult to be discrim-
inative when the subject works persistently up to the time limit
without making any appreciable progress toward success. The 
examiner’s errors in selecting the tests ought not to be counted 
against the subject’s mentality, and it is clearly unfair to include 
these records. 
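
The rule just described can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the Rating record, its two flags, and every age value below are hypothetical, and the decision whether a test was discriminative or well received remains the examiner's judgment, entered by hand. The only figures taken from the text are the minimum of nine counted tests and the eight-year ceiling of the Manikin test.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional

MIN_TESTS = 9  # from the text: never fewer than nine counted tests


@dataclass
class Rating:
    """One test result (hypothetical record layout, not from the paper)."""
    test: str
    age: float                   # mental-age rating read from that test's norms
    cooperated: bool = True      # examiner's judgment of co-operation
    discriminative: bool = True  # False if too easy or too difficult


def mental_age(ratings: List[Rating]) -> Optional[float]:
    """Median of the counted ratings, or None if fewer than MIN_TESTS count."""
    counted = [r.age for r in ratings if r.cooperated and r.discriminative]
    if len(counted) < MIN_TESTS:
        return None
    return median(counted)


# Invented ratings for one subject; only the Manikin ceiling of 8 is from the text.
ratings = [
    Rating("Kohs Block Design", 10.5),
    Rating("Healy Pictorial Completion II", 11.0),
    Rating("Seguin Form Board", 9.0),
    Rating("Mare and Foal Picture Board", 9.5),
    Rating("Five Figure Board", 10.0),
    Rating("Two Figure Board", 10.0),
    Rating("Casuist Form Board", 10.5),
    Rating("Triangle Test", 12.0),
    Rating("Ship Test", 13.0),
    Rating("Manikin", 8.0, discriminative=False),            # perfect score, ceiling reached
    Rating("Cube Construction", 0.0, discriminative=False),  # no progress within time limit
]

print(mental_age(ratings))             # median of the nine counted ratings
print(median(r.age for r in ratings))  # median if the two excluded records were counted
```

With these invented figures the median of the nine counted ratings is 10.5 years, whereas counting the ceiling score and the zero score as well would pull it down to 10.0, which is the penalty the exclusion rule is meant to avoid.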

By beginning with a test having a wide range of discriminative
capacity and gauging the subject’s ability by the results of the
first test, it is usually possible to avoid giving non-discrimina-
tive tests. For a subject of low grade mentality I usually
begin with the Seguin form board or the Mare and Foal picture
board. For a subject of high grade the Kohs Block Design test
is very satisfactory for the first presentation, because of its wide
range of applicability and especially because of its unusual attrac-
tiveness as a toy.

SUMMARY

An indefinite series of short-time mental tests, verbal and per- 
formance, is recommended for use in the clinic as a substitute for 
a fixed scale of tests. Any short test for which an independent 
age norm is available may be used in the series, and the mental 
age is derived by taking the median rating of ten to fifteen tests
which are clearly discriminative for the ability of the subject.

Many composite educational tests which are not suitable for 
clinical use in their entirety contain parts that would be highly 
valuable as single units of a series. The publication of norms for 
small homogeneous units, in addition to the norms regularly pub- 
lished, would be conducive to the wider usefulness of tests intended 
for educational surveys. 

The following advantages of this method over the Stanford Re- 
vision have been observed: 

1. The method is more elastic than a fixed scale, more adapt- 
able to the interests of the subject. By selecting tests that are in 
line with his interests, it is possible to obtain better co-operation. 

2. The method is adaptable also to the special qualifications of 
the examiner for administering tests, and the routine work of 
testing is thus made more attractive and less fatiguing. 

3. The method is less mechanical. By allowing more room for 
common sense in dealing with a subject, it is conducive to more 
careful observation of the patient as an individual. 

4. By placing less responsibility upon the examiner for the ob- 
servance of petty rules, it reduces the chances that undue liber- 
ties will be taken with the technique of examination. 

5. If the tests are well selected, it allows less play of personal 
equation in the scoring of results. 

6. It is more economical of time. The period is spent upon 
tests that really test the ability of the subject in a given direction, 
and no considerable amount of time is wasted upon tests that are 
either too easy or too difficult to show definite findings. There is 
also an absolute saving in the time required for testing a subject 
of high grade mentality. 

7. The validity of the results is not dependent upon the sub- 
ject’s co-operation in every test presented or in any particular test. 
If he can be induced to co-operate in any ten tests, it does not 
matter, for the validity of the test, how many questions he refuses 
to answer.

FRESHMAN TESTS IN THE SMALL COLLEGE¹


By Elsie Murray,
Wells College 


In this era of rapidly expanding college populations and chang- 
ing standards in entrance requirements and curricula, the uses of 
a freshman intelligence test are manifold. In the smaller college 
admitting by certificate only, one may desire a criterion by which 
to gauge the quality of successive entering classes of entrants 
with and without Latin; or of students in the college in point with 
those in other institutions. The test findings are also utilizable in 
connection with student guidance, to detect special abilities or dis- 
abilities, and to throw light upon cases of academic failure. There 
is, besides, the possibility of discovering a test combination which 
may later prove of value in the selection of desirable (or the elim- 
ination of undesirable) candidates for admission. 

With all of these objectives more or less in view, a freshman
group intelligence test was given at Sweet Briar College, in the
fall of three successive years: November 20, 1919, to 149 fresh-
men (class of 1923); October 26, 1920, to 116 freshmen (1924)
and 29 seniors (1921); October 19, 1921, to 116 freshmen (1925)
and 32 seniors (1922). The blank used was the Thurstone Psy-
chological Examination for College Freshmen, in its 1919 form
in 1919, and in the 1920 form in 1920. This test, which is ar-
ranged in ‘cycle-omnibus’ form, with a single time limit, requires
only 30 minutes, and calls for no formal oral instructions. It was
administered simultaneously to groups of about 40 each, by in-
structors coached by the writer.

Out of the various possibilities available in 1919, this particu- 
lar blank was selected for trial, in part because of ease of admin- 
istration, and economy of time and expense; largely, however, be- 
cause the types of problem represented (analogies, information, 
matching of abstract statements, judgment on the truth or falsity 
of arguments, sentence completion, completion of number series, 
in the 1919 edition, the same with the substitution of an extra word 
test, and arithmetical problems for the two last mentioned, in the 
1920 edition), had previously proved highly reliable in the writer’s 
testing of college women (juniors and seniors).² As compared with
Army Alpha, the Thurstone combination seemed better adapted to
call out genuine and sustained thinking ability on the part of the
student tested. In the long run, we must admit, the sentence com-
pletion test in the 1919 edition, and the arithmetical problems of
the 1920, were found to constitute a serious obstacle to our use of
this test. Not only does the substitution of the latter problems for
the former invalidate the comparative value of test scores in suc-
cessive years, but the wide variations in the possible solutions of the
former (of quite different qualitative value) prevent any con-
stancy in scoring from year to year. As regards the arithmetical
problems, many students, finding them to consume a dispropor-
tionate amount of time, simply omitted them entirely, thus running
up relatively high scores for speed (problems correct) as com-
pared with those of more conscientious individuals.

¹For aid and co-operation in the gathering of data for this paper thanks
are due especially to President Emilie McVea, and Dr. Mary Harley, of
Sweet Briar College.

²E. Murray: Psychological Tests as Diagnostic of Vocational Aptitudes
in College Women, Journal of Applied Psychology, March, 1920, pp. 33-34.

RAW RESULTS. TREATMENT OF SCORES. DISTRIBUTION 

Of the 443 students tested, only three covered the blank (168 
problems in the 1919 edition, 182 in the 1920) within the time 
limit set. Among the seniors, no one scored higher than 152 cor- 
rect out of 168, or 144 out of 182 among the freshmen. The test is 
therefore approximately long and hard enough for the group con- 
cerned, in the time allotted. 

The pedagogical difficulty of handling students of widely vary- 
ing mental calibre in a single section is a commonplace in educa- 
tion. In view of this, the range of scores within each freshman 
group is worth remark. The speed and alertness indicated by the 
upper scores (in problems correct and problems covered) is two to 
three times that represented by the lowest scores (see Table I). 
The range in proportion of errors and omissions to problems cor- 
rect, as measured by percentage accuracy, is also striking, though 
less extended (56 to 96% for 1923). 

The importance of calculating the percentage accuracy as well as 
the number of problems correct was first suggested by the fact 
that certain of the problems are couched in such a form that mere 
guesswork may yield a considerable number of successes; secondly, 
by scrutiny of the academic records of individuals in whom a 
relatively good score in speed or problems correct was observed to 
be accompanied by a relatively high proportion of errors. In 
many such cases, academic success or failure was found to bear 
a closer relation to the score in accuracy (number correct divided 
by number covered) than to the score in problems correct. E.g.,
in 1923, in more than half of the cases of striking dislocation be- 
tween decile rank in grades and in tests, the discrepancy is con- 
siderably reduced by substituting rank in accuracy for rank in 
problems correct (see Table VIII). It seems not unlikely that 
while the latter may constitute the better measure of mental alert- 
ness and agility, in certain fields, the former (percentage accuracy) 
is a truer gauge of all-round ability, or at least of certain character
traits: stability, perseverance, effort, ambition, and the related
qualities, which enter so largely into the determination of academic 
success. 
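
The accuracy score defined above (number correct divided by number covered, expressed as a percentage) can be illustrated with a short Python sketch. The function name and the two students' figures are hypothetical and serve only to show how a higher count of problems correct can go with a lower accuracy.

```python
def percentage_accuracy(correct: int, covered: int) -> float:
    """Percentage accuracy: problems correct divided by problems covered."""
    if covered <= 0:
        raise ValueError("covered must be positive")
    if not 0 <= correct <= covered:
        raise ValueError("correct must lie between 0 and covered")
    return 100.0 * correct / covered


# Two invented students (not taken from the Sweet Briar records).
swift_but_careless = percentage_accuracy(correct=100, covered=140)
slow_but_sure = percentage_accuracy(correct=85, covered=92)

print(round(swift_but_careless, 1))
print(round(slow_but_sure, 1))
```

In this invented case the first student leads in problems correct (100 against 85) but falls well behind in accuracy (about 71 per cent against about 92 per cent), which is the kind of dislocation between the two rankings that the text describes.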

Accordingly, after considerable experimentation with scores and 
correlation coefficients, the method was adopted of computing an 
‘adjusted score,’ by subtracting the sum of the errors and omis-
sions from the number of successes. The coefficient for these
‘adjusted scores’ and freshman grades ranges from +.42 to +.44 
for the different classes, and is always higher than that for the 
speed scores (e.g., for 1924, the correlation for academic grades 
and number of problems correct is only .375; for grades and ad- 
justed scores, .42). Each class was then split into ten equal sec- 
tions or deciles, on the basis of this score, those in the highest 
decile given a rank of 10, those in the lowest of 1, etc. Decile
ranks were similarly determined for each student both in problems 
correct, and in accuracy (later, in faculty estimates and in grades) 
and a copy of the whole placed on file in the registrar’s office for 
consultation by instructors and faculty advisers who desired 
further light on backward or promising students. 
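
The adjusted score and the decile ranking described in this paragraph can be sketched in Python as follows. The adjusted-score formula is the one stated in the text. The decile routine is an assumption: the text does not say how ties were broken or how a class whose size is not a multiple of ten was divided, so the sketch simply sorts the scores and cuts the ordered list into ten nearly equal sections. All figures are invented.

```python
from typing import List


def adjusted_score(correct: int, errors: int, omissions: int) -> int:
    """Successes minus the sum of errors and omissions (as stated in the text)."""
    return correct - (errors + omissions)


def decile_ranks(scores: List[float]) -> List[int]:
    """Rank 1 (lowest tenth) to 10 (highest tenth) for each score, in input order.

    Assumption: ties are broken by original position and sections are cut
    by position in the sorted list; the paper does not specify either point.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # stable sort by score
    ranks = [0] * n
    for position, i in enumerate(order):
        ranks[i] = position * 10 // n + 1
    return ranks


# Invented example: one careful and one careless student.
print(adjusted_score(correct=85, errors=15, omissions=8))
print(adjusted_score(correct=40, errors=45, omissions=18))

# Invented class of ten adjusted scores.
class_scores = [55, 12, 78, 60, -23, 118, 45, 67, 30, 90]
print(decile_ranks(class_scores))
```

Because errors and omissions are subtracted, the adjusted score can fall below zero, as in the second invented student (40 correct against 63 errors and omissions gives -23). The same ranking routine could be applied to problems correct and to percentage accuracy to obtain the three decile ranks kept on file for each student.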

This adjusted score, it is obvious, raises somewhat the standing 
of the ‘slow but sure’ individual, (over the fairness of whose 
rating in a speeded test those unfamiliar with mental measure- 
ments feel such concern), as it also lowers somewhat the rank of 
the slap-dash, swift but careless; and there becomes thus available 
a set of three score rankings, which furnish jointly a more satis- 
factory mental picture of the individual than any one taken singly. 
The usual combinations of swift and inaccurate, slow and accurate, 
swift and accurate, and slow and inaccurate, obtain. While decile 
combinations of 6-1-3, 1-9-4, etc., for speed, accuracy and adjusted 
score are not uncommon, and the correlation coefficient for speed
and accuracy in 1923 was only .37%4, there are also noteworthy 
cases of 10-10-10 and 1-1-1 (see Table II). 

COMPARISON WITH FRESHMEN SCORES IN OTHER COLLEGES 

The average score in problems correct for 1923 is 85.4. This 
falls a trifle below the corresponding averages quoted by Thurstone 
for Liberal Arts and Engineering freshmen, 86.6 and 88.8, 
respectively. The average for 1925, using the same test form, 

L. L. Thurstone: A Cycle-Omnibus Intelligence Test, Journal of Educational 
Research, Nov. 1921, p. 265. The range of averages quoted for 54 
Liberal Arts colleges runs from 61.6 to 107; for 43 Engineering Schools 
from 7[?] to 111.4. 
TABLE I 
DISTRIBUTION OF SCORES IN FRESHMAN TESTS, 443 SUBJECTS 

                            Freshmen                 Seniors
Class                   1923    1924    1925      1921    1922
No. Students             149     116     116        29      32
No. Problems             168*    182**   168*      182**   168

No. Problems Covered
Highest Score            168     169     158       182     168
75 Percentile            119     122     126.5     [?]     [?]
Median                   108     107     113.5     119     138
25 Percentile             94      93      98.5     [?]     [?]
Lowest Score              63      60      62        83     105

Attempted
H                        168     169     155       176     167
75                       [illegible; surviving fragments: 61, 91, 80]
Median                   102     104     111       115      36 [sic]
25                       [?]     [?]     [?]        87      99
L                         49      12      53        53      84

Percentage Accuracy
H                         96      92      95        96.5    94.3
75                        84      84      87        88      88
M                         80      76.5    81        82      84.8
25                        73      71      76        76      80
L                         56      43      60        55      62.5

Adjusted Score
H                        118     122     124       131     136
75                        78      68      82.5      99     103.5
M                         60      55      67        78      35.5 [sic]
25                        45      42      55        59     [?]
L                         12     -23      30        10      42

*  1919 edition of test. 
** 1920 edition of test. 
rises to 90, a figure slightly higher than either of Thurstone’s 
averages. There is, however, a weak spot in these comparisons: 
the Sentence Completion test, which occurs in the 1919 edition, and 
the scoring of which it has proved impossible to standardize fairly. 

MEASUREMENT OF IMPROVEMENT IN SUCCESSIVE ENTERING CLASSES 
In the use of the tests to measure the mental calibre of succes- 
sive freshman classes, the scoring difficulty above mentioned again 
arises. However, the scoring standard for sentence completion in 
1921 was apparently more severe, operating if at all to lower the 
score for the class of 1925, since the median number of correct 
answers for this type of problem remains stationary for this 
class, while the medians for all other problems rise a point or two. 

TABLE II 

                  BEST RECORDS                     WORST RECORDS
                  No.                 Adjusted     No.                 Adjusted
                  Correct   Accuracy  Score        Correct   Accuracy  Score
1923 Score        120       96%       115          52        56.5%     12
     Decile Rk.   10        10        10           1
1924              135       91        122
                  10        10        10
1921 (Senior)     136       96.5      131
                  10        10        10
1925              131       95        124
                  10        10        10
1922 (Senior)     152       90.5      136          105       46

A second hitch arose from the altered form of the test offered in 
1920, the arithmetical problems of which proved a stumbling block 
to our students, many of whom possess a veritable ‘number 
phobia.’ The median score for this year drops in consequence 3 
points instead of rising, although the general academic record of 
this class is above that of 1923. A certain degree of improvement 
in 1925 as compared with 1923 is, however, clearly discernible 
(see Tables I and IV); median and lower limits are both higher, 
though no individual in 1925 attains as high a score as the best in 
1923. 

In Tables III and IV, the relation of senior and freshman 
groups is presented. Since the best senior scores barely exceed 
the best freshman, the claim of the intelligence test to measure 
native endowment rather than acquired ability or knowledge would 
seem to be substantiated. Only the information and analogy 
scores of the seniors ascend disproportionately; the latter perhaps 
because of familiarity with similar tests gained in Psychology I. 
The bulk of each senior class ranks in the highest quartile of the 
freshman class subjected to the same edition of the test. This is, 
of course, as it should be, through the natural process of selection 
and elimination of the weaker operating over three college years. 
There remain, to be sure, in the senior classes a number of 
stragglers in the lower quartiles whose academic status is for the most 
part as unsatisfactory as their intelligence rating. 

In order to check the operation of selection versus training more 
exactly, in January, 1923, when the freshmen tested in 1919 were 
seniors, the number of survivors in each of the original quartiles 
was tabulated (see Table III). The percentage of the survivors 
ranking above the median was found to be only 61 per cent, a 
figure somewhat below that for senior-freshman comparisons when 
two classes are tested simultaneously, the former in their senior 
year (84%). The 25, 50, 75, 100 percentiles for 1923 seniors were 
also found to average only about 5 points higher than those for 
the same group as freshmen, although the discrepancy between 
senior and freshman medians for 1922 and 1925 (tested simul- 
taneously) was 18 points. Granted that the quality of entrants 
has not deteriorated from 1917-1921, individual improvement would 
seem to play a more important role than selection in raising senior 
scores. Even so, the rise to be credited to senior superiority in 
training—in knowledge, aplomb, concentration, grasp, or what not 
—say 10 or 12 points on the average, forms only a fraction of the 
total range of difference in abilities in the original group (90 to 
100 points). In other words, three years’ college training would 
avail, presumably, to shift the rank of a given individual hardly 
a couple of deciles—a conclusion borne out by the records of re- 
testing in the sophomore year, where the average gain approxi- 
mates a decile. 

The fact that scores vary on the average inversely with the age of the 
subject, from 16 to 21, might also be cited as evidence that the test 
measures, not amount of training, but native ability. 

Although, before proceeding to any such conclusion, the unusual losses in 
the upper test and grade deciles of 1923, largely through transfer to other 
institutions, should be drawn into the calculation. 
TABLE III 
RATING OF SENIORS WITHIN THE FRESHMAN QUARTILES. DISTRIBUTION 

Columns: For Problems Correct, 1921 and 1922; For Adjusted Score, 1921 
and 1922; 1923, Correct and Adj. Score. 

Quartile* Frequency 
IV 17 20 [?] 11 
III 7 7 7 10 
II 2 4 9 
I 3 1 4 
Total 29 32 29 32 
Above median: 24 or 83 per cent; 30 or 93 per cent; 25 or 86 per cent; 
27 or 84 per cent; 23 or 66 per cent; 21 or 61 per cent. 

*Quartile rank in terms of 1924 freshman records for 1921 seniors; 
in terms of 1925 freshman records for 1922 seniors; 
in terms of 1923 freshman records for 1923 seniors. 

LATIN ENTRANCE CREDITS AND INTELLIGENCE RATING. 

In order to estimate the possible detriment to the student body 
of altering the conventional Latin entrance requirement of the 
eastern colleges, both intelligence rating and academic standing 
for the freshman and sophomore years of students whose entrance 
certificates showed respectively 4, 3, 2, 1, or 0 Latin credits were 
tabulated. As shown in Table V, the median intelligence rating 
of those offering the standard number of units (4) is only slightly 
above the rating for those offering fewer. Further, the individuals 
of the two groups are similarly distributed over the ten test 
deciles, and the ratio of students above and below the median is 
practically the same (1 to 1) in each division. Adhering to the 
old 4-point Latin requirement would have cost the college 6 out 
of 15 of the most promising members of 1923; 4 out of 11 of 1924; 
and 2 out of 12 from 1925. If the intelligence test is a reliable 
indicator, the 4 unit Latin requirement is no longer the sieve for 
eliminating the weaker students which it seemed to be ten or 
fifteen years ago, when the brighter minds in the secondary schools 
were urged to complete the four years of Latin in the classical 
course, and failure to do so was a fair indication of either sloth 
or stupidity. This requirement now cuts out the sheep along with 
the goats. 

From the point of view of academic success, at least in the 
sophomore and freshman years, the comparison of the reduced 
credit with the standard group is less favorable. The median 
grade for the former is somewhat lower, the number of students 
above the class average proportionately smaller. At the same 
time, the students of this group range through all the deciles of 
grade rank, and their elimination through a 4-point entrance 
requirement would have cost the class of 1923 5 out of its best 13 
students; 1924, 1 out of 13; 1925, 3 out of 11. 

Assignment of the reasons for the slightly lower academic 
achievement of the Sweet Briar student with reduced Latin en- 
trance would be impossible without a fuller analysis of the con- 
ditions governing admission than is here practicable. There is, 
however, no real evidence that it indicates that the lack of Latin 
handicaps the student in either linguistic or scientific studies.* It 
may indeed signify that the steadfast pursuit of Latin studies is 
still in a degree a measure of the scholastic ambition and energy 
of the special group of students who apply to Sweet Briar for 
admission. This latter view would, in fact, seem to be supported 
by the falling away of 3-0 point Latin entrants in the sophomore 
year, as compared with 4 point—see Table V, in which the ratio 
of the former to the latter drops from roughly 1-1 to 1-2 for 1923, 
from 3-5 to 1-2 for 1924, in the sophomore years as compared with 
the freshman. This falling off is not, however, to be ascribed 
entirely to lack of scholarly ambition. The inability to obtain an 
A.B. degree on less than the full Latin entrance requirement 
operates either to discourage students from continuing in college, 
or to encourage their transfer to other institutions, e.g., state 
universities, where the lack of Latin is not a handicap. The effort 

TABLE V 
LATIN ENTRANCE CREDITS AND INTELLIGENCE RATING 
FREQUENCY OF CREDIT GROUPS 
Year        Class    No.     4     3     2     1     0   Tot. 0-3 Credits 
Freshman    1923     149    79    18    32     5    15   70 
            1924     116    71    14    24     7         45 
            1925     116    73    17    22     4 

Sophomore   1923      46 
            1924      44 
            1925      50 


CREDIT UNITS AND INTELLIGENCE RATING 
Median Test Deciles 
1923 6 5 6 
1924 6 5.5 4 
1925 6 5 6 
ACADEMIC RANK 
Median Grade Deciles 
1923 6 3 4 
1924 3 5 4 
1925 7 4 5 3 


*The English grades for the 4 unit Latin matriculates of 1925 actually 
average lower than those for the 2 unit group. 
to make up the missing Latin credits in college in order to enter 
for the A.B. degree while carrying full freshman work also oper- 
ates to lower their academic status. In any case, it is obvious that 
the effect upon the intelligence level of the student body of lower- 
ing the Latin requirement has not been at all as dolorous as the 
Latin enthusiasts were wont to predict. 

TABLE VI. 

CORRELATION OF TESTS AND GRADES 

                          Soph. and         Jun., Soph.         Jun. and    Four
Year   Freshman   No.     Freshman   No.    and Fresh.   No.    Senior      Years
1923   .43        146     .43½       78     .517         38
1924   .42        112     .48        65
1925   .44        113
1921   4 21 09 A8% 48


PREDICTIVE VALUE OF THE TEST 

Throughout the above discussion, the value of the Thurstone 
test as an indicator of ability has been more or less assumed. It 
remains to examine its claim more minutely, to determine the exact 
degree of reliability of the test in relation to success or failure 
at Sweet Briar College. With this object in view, the test scores 
have been correlated with instructors’ estimates,’ and with academic 
grades for from 1 to 3 years. The test ratings of students drop- 
ping out and remaining in each class have been compared. A 
number of members of 1923 and 1924 have been re-examined with 
an alternative form of the test. Cases of marked discrepancy be- 
tween scores and academic ranks have been carefully analysed, and 
the health records of each class scrutinized for whatever light 
they may throw upon the problem. 


This method of testing out the scores was abandoned on account of the 
difficulty of securing faculty estimates of intelligence on any other basis 
than that of class marks in the department concerned. The correlation of 
tests and estimates for 1923 was .43½, only ½ point higher than that for 
tests and grades. 

Rogers, reporting 182 Goucher freshmen, finds a correlation of .37½ for 
the Thorndike tests and freshman grades (see Journal of Educational 
Psychology, April, 1920, p. 181; Mental Tests as a Means of Selecting and 
Classifying College Students). The correlations between grades and Army 
Alpha reported from Yale, Ohio, and Syracuse Universities range from .20 
to .41. Higher coefficients are occasionally reported for special tests. 
Test Scores and Grade Rank. After trying out and discarding 
various methods of weighting, the following procedure was 
adopted. The letter grades of each student for the year were 
converted into numbers (following the scheme A = 11, A- = 10, 
B+ = 9, etc.) and the total obtained. This total instead of the 
usual average was then made the basis of a grade ranking, and 
correlation coefficients with test rating (based upon adjusted 
score) obtained. These coefficients (R, calculated by the 
Spearman foot-rule, 1.5) are shown in Table VI. The figure for the 
first year grades is about the same for all classes, +.43, and 
approximates that usually quoted for freshman groups of fairly 
homogeneous material. When sophomore and junior grades are 
added, this figure rises to .517 for the 38 surviving members of 
1923. 
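
The procedure in this paragraph can be sketched as follows. Several things here are assumptions rather than the author's stated method: only A = 11, A- = 10 and B+ = 9 appear in the text, so the rest of the scale simply continues the one-point steps; the coefficient uses the usual textbook form of Spearman's foot-rule, R = 1 - 6Σg/(n² - 1), where Σg is the sum of the positive rank differences; and tied values are given average ranks. All function names are illustrative.

```python
# A = 11, A- = 10, B+ = 9 come from the text; the lower values continue
# the same one-point steps and are an assumption of this sketch.
GRADE_POINTS = {
    "A": 11, "A-": 10, "B+": 9, "B": 8, "B-": 7,
    "C+": 6, "C": 5, "C-": 4, "D+": 3, "D": 2, "D-": 1,
}


def grade_total(letter_grades):
    """Sum (not average) of the numeric equivalents of a year's grades."""
    return sum(GRADE_POINTS[grade] for grade in letter_grades)


def average_ranks(values):
    """Rank 1 = highest value; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def footrule_r(test_scores, grade_totals):
    """Spearman foot-rule coefficient between two sets of measures."""
    n = len(test_scores)
    test_ranks = average_ranks(test_scores)
    grade_ranks = average_ranks(grade_totals)
    # Sum only the positive rank differences (the "gains").
    gains = sum(g - t for t, g in zip(test_ranks, grade_ranks) if g > t)
    return 1 - 6 * gains / (n * n - 1)


# Hypothetical records for five students: adjusted score and letter grades.
adjusted = [115, 98, 77, 60, 41]
totals = [grade_total(g) for g in (
    ["A", "A-", "B+", "B"], ["B", "B", "C+", "B-"], ["B+", "B", "B", "B-"],
    ["C", "C+", "C-", "C"], ["C-", "D+", "C", "D"],
)]
r_example = footrule_r(adjusted, totals)
```

With this form of the foot-rule, identical rankings give R = 1 and a complete reversal gives R = -0.5 rather than -1, so the coefficient is not on the same footing as a product-moment r.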

A clearer notion of the degree of correspondence (or divergence) 
between test and grade rankings may be obtained by dividing each 
class into four sections or quartiles, first on the basis of their test 
ratings, second, upon that of their year’s grades. The number of 
students occupying a similar quartile in each set of measures may 
then be computed, also the number occupying the highest quartile 
in one and the lowest in the other, i.e., the number both of 
coincidences and of absolute disagreements in rank. E.g., as 
shown in Table VII, 22 individuals out of the 35 in the highest 
test quartile for 1923 rank also highest in freshman scholarship; 
14 out of 28 for 1924; 15 out of 28 for 1925. At the same time 
the range of academic achievement for any given degree of test 
performance is wide, and a few individuals from the lowest test 
quartile in each class attain the highest grade quartile, at least 
in their freshman year. The percentage of the group maintaining 
the same quartile rank in both sets of measures is 43 for 1923, 
42 for 1924, 39 for 1925. This means that we would be absolutely 
right in the case of about 4 out of 10 students in predicting their 
quartile rank in scholarship on the basis of their quartile position 
in test rating; absolutely wrong in the case of only 3 or 4 out 
of 100 (these latter discrepancies being reduced or eliminated 
when the ranking is based upon the grades of later years). 
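
As a check on how these two figures are read off a quartile table, the sketch below tallies section B of Table VII (the 38 surviving members of 1923), the one section whose cells are all legible. The function name is illustrative, and the orientation of the table (which axis is tests, which is grades) does not affect either count.

```python
# Table VII, section B: rows and columns both run IV, III, II, I.
TABLE_VII_B = [
    [5, 5, 0, 0],
    [2, 1, 6, 4],
    [1, 3, 3, 1],
    [1, 1, 1, 4],
]


def coincidence_and_disagreement(table):
    """Return (total, coincidences, absolute disagreements).

    Coincidences lie on the diagonal (same quartile in both measures);
    absolute disagreements are the two corners pairing the highest
    quartile in one measure with the lowest in the other.
    """
    total = sum(sum(row) for row in table)
    coincidences = sum(table[i][i] for i in range(len(table)))
    disagreements = table[0][-1] + table[-1][0]
    return total, coincidences, disagreements


total, same, opposite = coincidence_and_disagreement(TABLE_VII_B)
percent_same = round(100 * same / total)
```

The tally reproduces the printed summary for section B: 13 coincidences in 38, or 34 per cent, and one absolute disagreement.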


The corresponding figure given by Rogers is only 35%. Whether this 
indicates the higher reliability of our scores, or the less selected and 
homogeneous character of our group, is debatable. 
TABLE VII. 
QUARTILE COINCIDENCE IN TEST AND GRADE RANKINGS 

A. 1923 FRESHMAN GRADES 
IV III II 
IV 22 7 5 
III [?] 13 7 11 
II 8 12 9 
I [?] 15 
No. Subjects 37 
Total 144 
Coincidence 62 (43%) 
Disagreement 4 (2.7%) 

B. 1923 JUNIOR, SOPHOMORE AND FRESHMAN GRADES 
               IV    III    II     I    Total
IV              5      5     0     0     10
III             2      1     6     4     13
II              1      3     3     1      8
I               1      1     1     4      7
Total                                    38
Coincidence    13 (34%)* 
Disagreement    1 


C. 1924 FRESHMAN GRADES 
IV III 
IV 14 8 
TESTS III 5 11 
II 8 4 
I 1 5 
No. Subjects 28 28 
Total 112 
Coincidence 47 (42%) 
Disagreement 3 (2.6%) 

D. 1925 FRESHMAN GRADES 
IV III I 
IV 15 7 
TESTS III 5 9 [?] 6 
II 4 10 8 
I 4 4 13 
No. Subjects 29 
Total 113 
Coincidence 44 (39%) 
Disagreement 6 (5%) 

the 38 remaining members of the class. 

E. 1921 FOUR YEARS GRADES 
III 
Total 21 
Coincidence 12 (57%) 
Disagreement 2 (9.5%) 


For the test ranking of those delinquent in studies at the end 
of the freshman year, also of significance here, see Table XI. For 
a corresponding group Thurstone reports delinquents to the num- 
ber of 0, 2, 3 and 9, for the test quartiles 4 to 1. 

What is the real significance of these figures? Can a test which 
yields a correlation coefficient of less than .80, a percentage 
quartile coincidence with grades of less than 75, be of utility in 
a college office, either in the guidance or elimination of students? 
Should material or method be modified, or is the quest hopeless? 

In answering this question, the necessity must first of all be 
emphasized of considering the grades for the entire college career 
of the student before passing judgment upon the test as prognostic 
of academic achievement. Grades, always tricky, are specially 
open to attack in the freshman year. First of all, there is the 
fallibility of the instructor, possibly too easily impressed by a glib 
memory, poise of manner, a fluent tongue. Second, the variance in 
ranking standards in different departments, illustrated by the fact 
that the median freshman grade in the various subjects ranges 
from C minus to C plus for 1924, from C plus to B minus for 1925 
(C=Fair, D=Conditioned, B=Good); while the middle point in one 
subject rose from C minus in 1920-21 to B minus in 1921-22. 
Third, owing to a certain latitude of choice in subjects, even in the 
freshman year, deceptively high records may be achieved by 
students doing very little real brain work; especially in the case of 
courses based upon material already familiar to the well-prepared 
student. 

There is also to be considered the routine character of many 
freshman courses, involving on the whole more of a memory grind 
in grammar, formulae, and fact, than later, more advanced, and 
elective studies, which are per se better fitted to evoke real reason- 
ing ability and initiative in the student. Lastly, from the point 
of view of the individual freshman, there is the chaotic character 
of the first year in college, involving as it does adjustments to a 
novel environment, collision of social and intellectual ambitions, 
as well as submission to a more or less arbitrarily imposed curri- 
culum, in which the special bent of the individual may be ignored 
or thwarted. All these factors conspire to lower the value of 
freshman grades as prognostic of future academic success, and to 
prevent the individual student from finding her true level in the 
freshman year. An easy schedule or single-minded industry may 
run up a record unduly; the reverse factors lower it. 

Even, however, when the upper class records are at hand, the 
indications are that the correlations will hardly rise higher than 
.60 or .70. Is this to be ascribed to the shortcomings of this par- 
ticular test, or is it inevitable in all mental testing of college 
women, representing a comparatively narrow range of abilities 
and subject to certain other peculiar conditions? It is a matter 
of common knowledge that many factors apart from intelligence 
enter into the determination of academic success. Health, ambi- 
tion, habits of industry, social distraction and diversion, worry 
and other causes may tip the scales one way or the other. Is it 
not likely, particularly in the case of a group of women, working 
less under the spur of strenuous vocational motives than do men, 
that the serviceability of any mental test is necessarily limited? 

Analysis of Cases of Marked Discrepancy between Test Score 
and Academic Rank. In order to apportion the blame between 
grades and tests, and determine if possible how far the test itself 
stands in need of alteration (in subject-matter, method of admin- 
istering, or scoring, duration, etc.) a number of cases of marked 
dislocation between academic rating and test scores were selected 
for further study. Twenty-three of these, along with a sprinkling 
of seven others to act as a control (30 in all, for the classes of 
1923 and 1924), were re-examined in March 1921 with alternative 
forms of the test blank. Secondly, a number of facts were collected 
and tabulated (see Tables VIII and IX) with regard to all freshmen 
in whom the amount of dislocation is equivalent to at least % of 
the total range of rank for the group (representing nearly a quarter 
of each class). 

A considerable amount of the divergence between the two sets 
of ratings proceeds undoubtedly from inadequacies in the grade 
ratings, the shortcomings of which, especially in relation to the 
freshman curriculum, have been already commented upon. As 
shown by the table, when sophomore grades are substituted for 
freshman, the discrepancy with test ratings is materially 
reduced; as a matter of fact, in three quarters of all such cases 
remaining in college. Cases of apparent overestimation by the test 
rise on the average 2 deciles in grade rank; cases of 
underestimation drop 3 deciles. 

Nevertheless, certain of the total number of cases in which the 
student’s academic record exceeds the promise of her test 
performance, are probably bona fide instances of underestimation by 
the tests. Seven out of ten of such cases when retested rose 
several deciles in rank (although in the majority of control cases 
there was little change); and there is every reason to believe that 
the second testing yields the truer measure of their real ability. 
Various familiar strictures as to the fallibility of test data seem 
to be here substantiated, notably the objection that a low score 
may be a function not of native stupidity, but of lack of training, 
of poise, self-control or self-knowledge, manifesting itself as 
nervousness, flightiness, poor judgment as to the best balance to 
strike between speed and accuracy, exaggerated confidence or undue 
caution. Further, a scrutiny of the health records indicates that 
a certain number of the cases of discrepancy studied were laboring 
under some more or less temporary physical handicap, such as 
recovery from a recent operation, which invalidated the earlier 
testing. 

In cases of apparent overestimation by the tests, the health factor 
enters also, though in a different fashion. At least half of the 
individuals in 1924 and 1925 whose grades fall seriously below 
expectation (15 out of the 30), and practically all whose sophomore 
standing shows no improvement, have poor health records; while, 
on the other hand, the majority of cases in the class of 1925 in 
which the grade rating is surprisingly high have excellent health 
records. There is undoubtedly a quick nervous type of individual, 
capable of running up a high score in a half hour test, but lacking 
the physical stamina (and with it the ambition), for the sustained 
effort essential to a brilliant academic career. In such cases, the 
high test score may sometimes be utilized to arouse dormant 
ambition, but the wisdom of so doing is often questionable; and such 
individuals are frequently outdistanced by classmates of mediocre 
ability but superior nervous and physical endurance. 

While the uncertainty and variability both of health factors and 
mental attitude complicate thus the exact prognosis of academic 
success or failure from this or any similar test, certain devices 
are available for increasing the reliability of test findings. Pains 
should be taken to stress the importance of the test results to the 
individual (without creating undue tension); to defer the test 
period, in case of the physically unfit; and to repeat the test (in 
alternative form) a month or more later, at least for those occupy- 
ing the lowest test deciles. Some more exact adjustment with 
regard to the inclusion of accuracy in the test score, or the weight- 
ing of errors, is also desirable. Something might also be gained 
by the supplementary use of a will test such as Downey’s or a 
more complete physical examination record. 

TABLE VIII. 
DISCREPANCIES BETWEEN TEST AND GRADE RATINGS 
Class of 1923 1924 1925 


Lower 
Higher 
Lower 


Grade Ranking* 
Number of Subjects 
Median Age 
Health Poor 
Freshman Course Easy 13 
Left Before Sophomore Year 5 
Soph. Grade Rating Nearer Test 11 
Accuracy Rating Nearer Grade 10 
No. Retested 7 
Retested Nearer Grade Decile 4 

Total 38 

*As compared with Test Rating. 

TABLE IX. 
RESULTS OF RETESTING 

Class of                                           1923   1924   Total
No. Retested                                         16     14     30
Cases Discrepancy between Test and Grade             13     10     23
Control Cases                                         3      4      7
Second Test Ranking Nearer Grade Decile               6      5     11
Av. Decile Deviation of Freshman Grades from
    Test I Rating                                                  4.9
Av. Decile Deviation of Freshman Grades from
    Test II Rating                                                 3.6
Av. Decile Deviation of Sophomore Grades from
    Test I Rating                                                  3.8
Av. Decile Deviation of Sophomore Grades from
    Test II Rating                                                 2.9

It may be added that the factor of age, or rather, that of 
youth, seems also to complicate our results in a fashion not quite 
decipherable. There is a tendency to an inverse relation between 
age and test rank, a direct one between age and grades. The 
precocious mind tests high, but a certain maturity of experience 
and purpose as well as of mentality is apparently desirable in 
order that higher studies shall be assimilated in their full value, 
and studious habits set up. The weight of evidence is against 
lowering the age limit. 

Test and Grade Rank of Withdrawals. 

Another criterion of the academic value of test scores is some- 
times sought in an analysis of the cases of withdrawal. For 
Sweet Briar students, the distribution of the losses (shown in 
Table X) is fairly uniform over all the test deciles, except in the 
case of 1923, for which class, the mortality in the lowest decile is 
conspicuously greater than that in the highest. Again, the loss of 
students from the upper half of the test ranks is only slightly less 
than that from the lower—roughly as 5 to 6 for 1923, and almost 
1 to 1 for the two later classes. Obviously test ranking cannot be 
used at Sweet Briar to forecast the probability of withdrawal as 
suggested by Thurstone. That this is not, however, to be inter- 
preted as per se a reflection upon the reliability of test ratings is 
indicated by the second half of Table X: the proportion of total 
losses for the upper grade deciles practically parallels that for 
the upper test deciles—except in the case of 1925, where the record 
is as yet incomplete. 

TABLE X 

TEST AND GRADE RANK OF WITHDRAWALS 

TEST RANK* 
Year 1923 1924 
(154)*** (116) 
GRADE 
1923 
(146) 
10th decile [?] 3 1 
1st decile 
Above median 
Below median 
Total of 109 [?] 71 

*Adjusted score 
**Incomplete 
***Students registering at mid-years are included in this table 

TABLE XI 
QUARTILE TEST RANKING OF DELINQUENT STUDENTS 
(Those in lowest grade decile at end of freshman year) 

                                                  Test of
Test Quartile              1923    1924    1925   Thurstone
4                             1       1       1       0
3                             3       2       2       2
2                             3       5       2       3
1                             8       3       7       9
Total                        15      11      12      14

No. Re-registering          [?]       3       4
Total number in class       149     116     116     150


The reason for this state of affairs is not far to seek. In a 
woman’s college, manifold causes other than ability and ambition 
operate to reduce the ranks of re-registrants, among them ill- 
health, poor preparation, unwillingness to conform to college 
discipline, social ambitions, family affairs and finances, desire for 
specialised courses, or a whim for transfer to other institutions. In 
a smaller college such as Sweet Briar, many freshmen enter with 
the deliberate intention of remaining only a year or two, and all the 
allurements of culture do not avail to move them from this intent. 
Nevertheless, there remained in June, 1922, of the lowest test 
decile of the class of 1923, only one student; of the lowest of 1924 
three only—all struggling under heavy handicaps and in the 
lower ranks of the class. In June, 1922, moreover, the honor stu- 
dents were all, with one exception (a student whose low informa- 
tion and arithmetic scores reduced her original record, but who 
ranked considerably higher in retesting with an alternative form 
of the test) drawn from the upper quartiles, if not always the 
upper decile of their respective classes. In January, 1923, there 
remains, out of the lowest decile of the class of 1923, no student; 
of 1924, two only; of 1925, three. 

Lastly, if quartile coincidence in rank is reckoned between the 
academic ratings for the freshman and junior years of the 38 sur- 
viving members of 1923, there is found to be 37% coincidence, and 
5% total discrepancy (i.e., two students, ranking in the lowest 
quartile in their freshman year, rise to the highest in their junior). 
The percentage coincidence for junior grades and tests is 39%, 
total discrepancies lacking. 

Summing up, then, all the evidence, in face of the fact that at 
best barely a quarter of each entering class survives to graduation, 
it seems reasonable to conclude that the college would benefit 
materially by the lopping off at matriculation of the lowest test 
decile; and that the risk in so doing of losing any real college 
material is slight. Granted the precaution of a second testing, a 
dead line for admission might be drawn not merely at the 10 but 
even at the 25 percentile, that is, at a score of about 72 correct 
and 73% accuracy (in terms of the records of 1923 at Sweet Briar, 
for the 1919 edition), or better, at an adjusted score of 44. Where 
it is not deemed feasible to risk the reduction of the freshman class 
in this fashion, it would at least be desirable to encourage by every 
means available the students of the upper quartile, to discourage 
the return at the end of the freshman year of the students of the 
lowest. Judging by the correlations and other data this course 
would in the end prove more satisfactory than merely seeking the 
withdrawal of the student of the lowest grade decile or quartile. 

The question of the suitability of the component elements of the 
two editions of the Thurstone test for our particular task, and the 
value of each relative to the different branches of the curriculum, 
is reserved for discussion in another place. 





PREDICTION FORMULAE FOR TEAMS OF APTI- 
TUDE TESTS 


By CLARK L. HULL 
University of Wisconsin 


I 

The rapid advance in the practical application of tests to the 
prediction of aptitudes, promises soon to bring into general use 
scientific prediction formulae. With the passing over of this im- 
portant branch of test technique from the theoretical to the prac- 
tical, comes a re-examination of the properties of the regression 
equation, now generally accepted as the standard prediction form- 
ula. It is the chief object of the present article to point out certain 
practical difficulties associated with the use of the regression 
equation for this purpose, and to propose a remedy. 

Suppose the employment department of an industry has two 
jobs for each of which a team of tests has been devised and each 
team provided with an appropriate regression equation. Let it be 
further supposed that this employment department has been in the 
habit of rating its employees on a simple uniform scale such that 
the worst man in a hundred is scored zero, and the best scored 10, 
with the mean at 5; and that the diagnostic potency of the first 
team of tests (R) is .60 while that of the second is .40. Now if 
such an employment department should begin to use its tests and 
regression equations in the prediction of aptitudes of prospective 
employees and with exactly the same grade of men as previously 
employed, it would find that the highest predicted score ever ob- 
tained with the first equation would be 8, and the highest with the 
second only 7, instead of the 10 to which it had been accustomed. 
If the employment department had been in the habit of discharg- 
ing all employees known to be below score 2 in aptitude and had 
planned on rejecting all applicants who were indicated by the 
tests as probably below this level, it would find to its surprise that 
the first equation would predict none at all below 2, and the 
second none below 3! The equations would thus indicate no ap- 
plicants as being either very bad or very good. If the employ- 
ment department should desire to reject the lowest 25% of all 
applicants, it would have to select a radically different score from 
the score marking the lowest 25% of its actual employees. Not 
only this but it would have to select a distinctly different score to 
accomplish this with each of the two regression equations. Or, 
to consider the matter from the point of view of the order of 
merit, a predicted score of 6 by the first team of tests corresponds 
to a rank of tenth from the best in a hundred, whereas the same 
score with the second team corresponds to a rank of third from 
the best in a hundred. 
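
The figures in this illustration can be checked with a short sketch. It assumes, as the example implies but does not state outright, that each regression equation shrinks an applicant's standing toward the mean of 5 in proportion to R; the function name `predicted_score` and the argument `test_standing` are hypothetical labels introduced only for this check.

```python
def predicted_score(test_standing, r, mean=5.0):
    """Regression estimate on the 0-10 rating scale.

    Assumption: the applicant's standing on the team of tests is expressed
    on the same 0-10 scale, so the estimate is the mean plus R times the
    deviation of that standing from the mean.
    """
    return mean + r * (test_standing - mean)


# Lowest (0) and highest (10) possible standings under each team of tests.
first_team = (predicted_score(0, 0.60), predicted_score(10, 0.60))
second_team = (predicted_score(0, 0.40), predicted_score(10, 0.40))

print("first team range :", first_team)
print("second team range:", second_team)
```

Under these assumptions the first team can predict nothing outside the range 2 to 8, and the second nothing outside 3 to 7, which is the narrowing described above.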

Now the cause of all this confusion lies in the fact that regres- 
sion equations have an inevitable and usually radical tendency to 
narrow the range of distribution of predicted scores as compared 
with that of the original criterion. This is at once evident from 
an inspection of the basic regression equation,¹ 

$$x_1 = r\,\frac{\sigma_1}{\sigma_2}\,x_2 \tag{1}$$

when it is recalled that r is always less than unity. The operation 
of this tendency in the case of multiple correlation may be shown 
by means of the following miniature problem, which will also 
serve to illustrate certain other principles to be brought forward 
later. The original data of the problem are given in Table I, 
together with the results of the necessary preliminary computa- 
tions for the regression equation. 

TABLE I 

Subj.   Criterion   Test A   Test B   Correlations      Partial S.D.'s 
           (1)        (2)      (3) 
I                                     r12   = +.625 
II                                    r13   = +.5625    S.D.1.23 = 2.78 
III                                   r23   = +.375     S.D.2.13 = 1.56 
IV                                                      S.D.3.12 = 3.31 
V                                     r12.3 = +.542 

Mean        7          8        7     r13.2 = +.452 
S. D.       4 

Dropping the smaller decimals, the primary regression equa- 
tion on this criterion is, 

$$x_1 = .968x_2 + .381x_3 \tag{2}$$

where the x’s are the deviations of the individual subjects’ scores 
from the means of the separate arrays. Substituting in this to ob- 
tain an equation in which actual test scores may be used for pur- 
poses of prediction, 

$$X_1 - 7 = .968(X_2 - 8) + .381(X_3 - 7) \tag{3}$$

Clearing and collecting, we obtain, 

$$X_1 = .968X_2 + .381X_3 - 3.41 \tag{4}$$

where the X’s represent actual scores. This is the form of the 
regression equation ordinarily used for purposes of prediction. 


¹Yule, G. U.: Introduction to the Theory of Statistics, p. 171. 

Substituting in equation (4) the test scores of each subject 
shown in Table I, we obtain by successive application, the follow- 
ing “predictions”: 

Subject Predicted 
score 
I 7.38 
II 3.33 
III 4.13 
IV 10.64 
V 9.52 


Mean 7.00 
S.D. 2.877 


These predicted scores correlate with the criterion to the extent 
of +.719. The mean of the predicted scores is seen to be, as always, 
the same as that of the criterion, 7. It will be noted, however, 
that the S. D., as would be expected from the nature of the re- 
gression equation, is much less than that of the criterion, being 
only 2.877 as against 4. This is what caused the confusion noted 
above. 
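
The shrinkage can be checked directly from the five predicted scores just listed. The sketch below recomputes their mean and S. D. and compares the S. D. with that of the criterion (4) scaled by the correlation of .719. Taking the S. D. with N rather than N - 1 in the denominator is an assumption about how the figures were computed.

```python
from statistics import mean, pstdev

# Regression predictions for subjects I-V, as listed above.
predicted = [7.38, 3.33, 4.13, 10.64, 9.52]

criterion_sd = 4.0    # S. D. of the criterion, from the text
correlation = 0.719   # correlation of predictions with the criterion, from the text

pred_mean = mean(predicted)
pred_sd = pstdev(predicted)              # population S. D. (divides by N)
shrunk_sd = correlation * criterion_sd   # criterion S. D. scaled by the correlation

print(round(pred_mean, 2), round(pred_sd, 2), round(shrunk_sd, 2))
```

The two S. D. figures agree to within rounding of the printed values, while both fall well short of the criterion S. D. of 4.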
II 

As indicated by equation (1) the amount of shrinkage in the 
dispersion of the predicted scores is dependent upon the size of r. 
In cases of multiple correlation such as the present where we 
have a predicted criterion derived from a set of weighted tests, 
the shrinkage in dispersion is dependent upon the size of the cor- 
relation between the predicted and the actual criterion scores. Yule 
has designated this correlation as R.² Thus in the above example, 
if the S. D. of the criterion (which is 4) be multiplied by R (.719), 
the result will be 2.877, which is identical with the S. D. of the 
predicted scores as obtained by actual computation. Knowing this 
relation then, we may write the formula, 

$$x_1 = c\,b_{12.3}\,x_2 + c\,b_{13.2}\,x_3 \tag{5}$$

where 

$$c = \frac{1}{R}$$

and all of the other symbols are the same as employed by Yule. 
This formula, while leaving the correlation between the tests and 
the criterion undisturbed, yields a prediction series with exactly the 
same mean and degree of dispersion as the criterion itself, thus 
eliminating the objectionable features of the regression equation 
noted above. It is accordingly proposed for use in vocational and 
employment prognosis. 
²Yule, op. cit. p. 248. 










In determining R, it may be recalled that it is not necessary 
actually to make the predictions and then compute the correlation 
between prediction and criterion as above, in order to find R for 
use in formula (5). R may be found very readily by formulae of 
the following type:³ 

$$R_{1.23} = \sqrt{1 - (1 - r_{12}^2)(1 - r_{13.2}^2)} \tag{6}$$

Substituting appropriately in equation (6) from the values given 
in Table I, 

$$R_{1.23} = \sqrt{1 - (1 - .625^2)(1 - .452^2)} = .719$$

This agrees exactly with the r computed from the predicted and 
the original criterion scores in the ordinary manner. 
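
Formula (6) is easily evaluated by machine. The sketch below is a minimal illustration using the correlations of Table I; the helper `partial_r` uses the standard first-order partial correlation formula, which is not written out in this article and is included here only as a cross-check on the tabled value of r13.2. Because the tabled correlations are rounded, the result may differ from the printed .719 in the third decimal.

```python
import math


def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation of a with b, holding c constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def multiple_r(r12, r13_2):
    """Multiple correlation R(1.23) by formula (6)."""
    return math.sqrt(1 - (1 - r12 ** 2) * (1 - r13_2 ** 2))


# Zero-order correlations from Table I.
r12, r13, r23 = 0.625, 0.5625, 0.375

r13_2 = partial_r(r13, r12, r23)   # compare with the tabled +.452
R = multiple_r(r12, 0.452)         # formula (6) with the tabled values

print(round(r13_2, 3), round(R, 3))
```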

The operation of the proposed prediction formula (5) may be 
illustrated by means of the example already considered. Substi- 
tuting the value of R in the formula for c, 

$$c = \frac{1}{.719} = 1.391$$

Substituting the value of both c and b in formula (5), 

$$x_1 = 1.391 \times .968\,x_2 + 1.391 \times .381\,x_3 \tag{7}$$

Equation (7) is analogous to equation (2) above. From here on, 
the procedure is exactly like that with an ordinary regression 
equation. Substituting the values of the means as in equation 
(3) and solving, we have, 

$$X_1 = 1.3845X_2 + .530X_3 - 7.468 \tag{8}$$

Substituting the test scores of the various subjects in (8) as was 
done in (4), the following predictions are obtained: 

Subject    Predicted score 
I               7.529 
II              1.844 
III             3.007 
IV             12.051 
V              10.506 

Mean            6.987 (7) 
S. D.           3.995 (4) 

It will be observed that in this prediction, not only the mean is 
the same as that of the original criterion, but that the standard 
deviation is also the same. The shrinkage in the dispersion of 
the predicted scores caused by the regression equation has been 
corrected by the prediction formula (5), and the standard devia- 
tion restored to its original size of 4. The correlation is entirely 
undisturbed, standing at .719 exactly as before. 

³Yule, op. cit. p. 248. 
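
The effect of the constant c can be sketched without the raw test scores. Multiplying both regression weights by c is the same as stretching each regression prediction's deviation from the mean by c, so the sketch below works from the regression predictions listed in Section I. It assumes R may be taken as the ratio of the S. D. of those predictions to the S. D. of the criterion, which is the relation stated above; the helper `pearson` is an illustrative function, not part of the article.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Product-moment correlation of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


regression_pred = [7.38, 3.33, 4.13, 10.64, 9.52]   # from Section I
criterion_mean, criterion_sd = 7.0, 4.0

R = pstdev(regression_pred) / criterion_sd   # assumed equal to the multiple R
c = 1 / R                                    # constant of formula (5)

# Stretch each deviation from the mean by c.
corrected = [criterion_mean + c * (p - criterion_mean) for p in regression_pred]

print(round(c, 3), round(mean(corrected), 3), round(pstdev(corrected), 3))
print(round(pearson(regression_pred, corrected), 6))
```

Since the corrected series is a positive linear function of the regression predictions, its correlation with any third series, the criterion included, is the same as theirs.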

It is to be noted that the prognosis formula employed above 
is not itself a regression equation. The true multiple regression 
equation serves two distinct functions. One function is to give 
the weight that each test score should be multiplied by, that all 
combined shall yield the maximum correlation with the criterion. 
Innumerable such optimal weightings are possible, since all that 
is necessary for this is that a certain optimal proportion shall 
be maintained among the weights given to the various tests. 
The second function of the regression equation is to indicate 
the particular set of optimal weights that, when the test scores 
are multiplied by them and combined, they will yield the most 
probable absolute criterion score obtainable from the data. The 
proposed formula performs the first function but not the second. 

To give the most probable absolute criterion score, the regres- 
sion equation must always reduce more or less extensively the 
dispersion of the predicted scores below that of the criterion 
aimed at. This introduces an element of uncertainty into their 
use for employment purposes, very much like what would happen 
if predictions were to be made on a new and unfamiliar scale. 
Thus with a basic 10-point scale, the maximum possible pre- 
dicted score may vary from 6 or so, up to above 9. For this rea- 
son a predicted score of 6.5, say, has little meaning when applied 
to an individual subject. It may mean that he is probably only 
slightly above the average applicant in aptitude, or it may mean 
that he is probably the one best applicant in a hundred. The 
predicted score, standing alone, gives us no light whatever as 
to which of these two extremes is more nearly approached. Un- 
fortunately this is exactly what the employer wishes to know 
from a team of tests. By restoring all predicted dispersions to 
the size of the standard criterion aimed at, the proposed formula 
eliminates this ambiguity. 

It is also easy to show that, in spite of the fact that the 
regression equation yields the most probable absolute predicted 
score, its ordinary use in employment may lead to unfortunate 
results. Let us revert to the employment situation instanced 
above (Sec. I). Suppose the labor market to be such that it is 
necessary to hire practically all applicants, the two teams of tests 
being used merely to decide in which of the two jobs a given 
applicant probably has the greater aptitude. Let it be further 
supposed that aptitude in the two jobs is positively correlated. 
One result would be that men of fairly equal but low ability on 
both jobs would tend to score higher on the second job, since 
the low R of this team of tests tends to raise all low scores 
towards the mean, thus tending artificially to shunt the poor 
workers into the second job. On the other hand, subjects having 
relatively equal but high aptitude for the two jobs will tend to 
score lower on the second team of tests because the low R in 
this case has lowered towards the mean predictions from all high 
scores, thus tending artificially to shunt the good workmen out 
of the second job. As a result of this there will be a tendency 
for the poor workmen to be segregated on the second job and 
for the better workmen to be segregated on the first job. But 
with the proposed prognosis formula, since the range of predic- 
tion would be exactly the same for both teams of tests, the un- 
fortunate tendency noted above could not take place. It would 
still be true, of course, that the predictions made from the team 
of tests with the weaker R would be less accurate than those made 
from that with the stronger R. 
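
The shunting tendency can be illustrated with the two teams of tests of Section I (R of .60 and .40, scale mean 5). The sketch assumes an applicant whose standing on both teams of tests is the same, and that each regression equation shrinks that standing toward the mean in proportion to R; all function names are hypothetical.

```python
MEAN = 5.0  # mean of the 0-10 rating scale used in Section I


def regression_prediction(standing, r):
    """Ordinary regression estimate: deviations shrink toward the mean by R."""
    return MEAN + r * (standing - MEAN)


def corrected_prediction(standing, r):
    """Proposed formula: the regression weights are multiplied by c = 1/R."""
    c = 1.0 / r
    return MEAN + c * r * (standing - MEAN)


def preferred_job(standing, predict, r_first=0.60, r_second=0.40):
    """Job on which the applicant receives the higher predicted score."""
    first = predict(standing, r_first)
    second = predict(standing, r_second)
    if abs(first - second) < 1e-9:
        return "either"
    return "first" if first > second else "second"


for standing in (2.0, 8.0):
    print(standing,
          preferred_job(standing, regression_prediction),
          preferred_job(standing, corrected_prediction))
```

With the regression equations the low applicant is sent to the second job and the high applicant to the first; with the corrected formula neither job is favored on account of R alone.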


III 


It will not infrequently happen that the original unit of meas- 
urement of a criterion will be such as would be undesirable for 
use in prediction. For example, the original criterion of the 
efficiency of lathe workers might be a set of careful micrometer 
readings showing the deviations from specifications of certain 
products turned out by the subjects. Obviously such a criterion, 
while probably excellent from the point of view of reliability, 
would hardly be sufficiently intelligible for general use in the 
prediction of aptitudes.⁴ 


A special need to make predictions in terms other than those 
of the original criterion measurement will arise where it is de- 
sired to predict the aptitude of a person on a number of different 
vocations for purposes of comparison, with a view of determining 
in which he is most likely to succeed. Such comparative predic- 
tions, it would seem, must become very general if scientific voca- 
tional guidance is to become a reality. To be readily comparable, such 
predictions should all be made in terms of the units of some 
simple uniform scale, e. g. one of ten points. This scale will 
ordinarily not (except by chance) be the same as that of the 
original criterion measurement. For such cases the following 
method has been derived by means of which predictions may be 
made in terms of scales having any desired means and any de- 
sired standard deviations, quite irrespective of what the corre- 
sponding values of the original criterion scales may have been. 

⁴A rather common form of this difficulty requiring special treatment is 
where, as so often happens, the only available criterion is a set of ranks of 
the subjects in the activity studied. Ranks are not only unsatisfactory 
units for purposes of prediction, but significant regression equations can- 
not be based upon them. For such cases, a method of converting ranks 
into linear units convenient for purposes of prediction and upon which 
regression equations may be based, is described in this Journal, Vol. VI, 
p. 385 ff. 

The procedure is first to work out the primary regression 
equation on the basis of the original criterion measurement ex- 
actly as in (2) above. Then if we let $\sigma_1$ be the S. D. of the 
original criterion scores, and $\sigma_1'$ be the S. D. of the scores to be 
predicted on the desired scale by the prediction formula, we may 
write, 

$$c = \frac{\sigma_1'}{R\,\sigma_1} \tag{9}$$

This value of c is substituted in equation (5). Lastly the mean 
of the desired prediction series is used in place of that of the 
criterion series when the x-values are converted into X-values 
as in equation (3) above. 

The concrete working of the method may be illustrated by trans- 
forming equation (5) into one which will predict in terms of a 
10-point scale with the mean at 5 and with a S. D. of 2. Sub- 
stituting the appropriate values in (9), 

$$c = \frac{2}{.719 \times 4} = .695$$

Substituting the value of c in equation (5) and solving we have, 

$$x_1 = .6729x_2 + .2649x_3 \tag{10}$$

Substituting in (10) the values of the means as in (3) above ex- 
cept that here 5 (the desired mean of the new predicted series) 
replaces the 7 of the original criterion series, we have: 

$$X_1 - 5 = .6729(X_2 - 8) + .2649(X_3 - 7)$$

Clearing and combining, 

$$X_1 = .6729X_2 + .2649X_3 - 2.2375 \tag{11}$$

This is the formula that we have been seeking. 

Lastly by substituting the test scores of Table I in formula 
(11) we obtain the following predictions, this time in terms of the 
desired 10-point scale: 

Subject    “Predicted” score 
I               5.265 
II              2.453 
III             3.003 
IV              7.528 
V               6.753 

Mean            5.000 
S. D.           2.000 

It will be observed that the prediction yielded the required mean 
and standard deviation, and that the correlation is still undis- 
turbed. 
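
The steps from (9) to (11) can be retraced in a few lines. The sketch below takes the regression weights (.968 and .381), the test means (8 and 7), the criterion S. D. (4) and R (.719) from the text, and a desired scale with mean 5 and S. D. 2. The function and variable names are illustrative only, and small third-decimal differences from the printed coefficients are to be expected from rounding.

```python
def rescale_constant(desired_sd, multiple_r, criterion_sd):
    """Constant c of formula (9): desired S. D. over R times criterion S. D."""
    return desired_sd / (multiple_r * criterion_sd)


# Values taken from the text.
b12_3, b13_2 = 0.968, 0.381        # regression weights of equation (2)
mean_test_a, mean_test_b = 8.0, 7.0
desired_mean, desired_sd = 5.0, 2.0

c = rescale_constant(desired_sd, 0.719, 4.0)

# Weights of equation (10), then the constant term of equation (11).
weight_a = c * b12_3
weight_b = c * b13_2
constant = desired_mean - weight_a * mean_test_a - weight_b * mean_test_b

print(round(c, 4), round(weight_a, 4), round(weight_b, 4), round(constant, 4))
```

A prediction on the new scale is then `weight_a * X2 + weight_b * X3 + constant` for any pair of test scores.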

The same results as secured above may be obtained quite 
simply by converting the undesirable criterion scores into the 
desired prediction scale before the zero order correlations are 
computed. This may be performed with ease and precision by a 
method described in this Journal, Vol. VI (p. 298 ff.) 

In case it should be desired to predict in terms of a scale 
different from that of the original criterion but by means of a 
true regression equation instead of the prediction formula rec- 
ommended above, the procedure is exactly the same as that just 
described, except that in this case, 

$$c = \frac{\sigma_1'}{\sigma_1}$$





LABOR TURNOVER AND MENTAL ALERTNESS 
TEST SCORES 


By A. J. SNOW 
NORTHWESTERN UNIVERSITY 


That labor turnover is an ever present problem and a costly one 
to the industrial institutions that harbor it and to the community 
generally is admitted by students. Some industrial managers fully 
realize the waste involved in the maintenance of an unnecessarily 
large labor turnover; if there is any one who doubts it, it is that 
group of employers who do not possess the capacity to move along 
mentally in the realm of industrial relations with the constant in- 
dustrial expansion. It is well known that this larger group, com- 
posed mainly of inefficient and dogmatic employers, is the 
greatest single obstacle in eliminating this form of social waste. 
That this group of men is always willing to assert their superior 
and infallible knowledge in handling employees, is an ever present 
experience of the student in this field, and that they have almost 
completely failed in dealing with this problem as well as with the 
other problems of industrial relations, we have proof of in the 
increasing industrial unrest. 

Labor turnover is not only an industrial problem, but directly 
a social and community problem; this can be seen when we realize 
that waste in any form adds to the actual cost of production, which 
is paid eventually by the community of consumers. To point out 
only five of the many items which are chargeable to this form of 
preventable waste, from the point of view of management, we have 
the following: first, expense incident to employing; second, instruc- 
tion to the new employee by the foreman; third, spoiled work be- 
cause of new surroundings, new tools, etc.; the same is true of the 
fourth factor,—breakage of tools and machinery; fifth, decreased 
production due to the temporary low production of new help. From 
the point of view of the ill-placed employee, we can enumerate the 
following three factors: first, loss in wages incident to the change 
of jobs; second, decrease in earnings due to the period of adapta- 
bility to the new job; third, decrease in earnings due to mis- 
placement. 


It has been known to industrial psychologists, some of the more 
enlightened employers and a handful of economists that wages 
and working hours are not the only industrial problems; in fact, 
wages and hours, as an incentive, are subordinate, or, at most, 
on a level with the control and distribution of power within an 
industrial organization. All that one has to do to verify this 
hypothesis is to estimate the number, magnitude and importance 
of disputes over the distribution of power as compared to that 
of wages and hours in any industrial court of arbitration of long 
standing, such as the industrial court in the clothing industry. 
The same may be said of our present problem—labor turnover. 
The casual observer is certain that the most responsible factor 
for an abnormal labor turnover is the worker’s dissatisfaction 
with his wage. The student of the subject knows, however, that 
dissatisfaction with wages will not alone induce the worker to 
change his job, much less to give it up with the ever-looming 
possibility of unemployment. Most human beings must feel that 
the change will be beneficial in more ways than one. As a rule 
there are a number of factors contributing to the worker’s de- 
cision to leave, and some of these may very well be of the fol- 
lowing nature: abuse of power—perhaps the greatest source of 
dissatisfaction—wages, hours, unsanitary conditions, arbitrary 
advancement policy, lack of instruction, faulty machinery or tools, 
heterogeneous “consciousness of kind” among team workers, 
personal incompatibility with immediate superiors, lack of stan- 
dardized methods of process, lack of instruction, lack of spe- 
cialized training on the part of the worker, economic insecurity, 
physical incapacity, mental incapacity and a mentality too great 
for its task. 


Statistics are unfortunately lacking as regards the comparative 
importance of most of these various factors. However, some 
interesting data has recently been gathered in connection with 
the two last factors. While admitting the danger of any study 
which aims to dissociate any one factor from its immediate sur- 
roundings, nevertheless, such a study may be justified on the 
grounds of its illuminating results and their actual use, when 
other factors composing the complete situation are given due 
weight. Consequently, the statistical study as to the relation of 
mental alertness and labor turnover that is to follow must not 
be given undue prominence in the mind of the student of the 
general and wider problem of labor turnover. 


The Scott Company has found the following tendency in their 
practical application of psychological methods to industrial man- 
agement—that, when all of the men in any organization, excluding 
major and minor executives, are given the Scott Company Mental 
Alertness Test,¹ the score so obtained tends on the average to 
change with the worker’s occupational group.² This tendency 
would suggest that a worker placed in an occupation which de- 
mands higher capacity for alertness than he is capable of would 
either become dissatisfied with his abnormally slow progress or 
be discharged for incompetency. On the other hand, a 
worker of higher alertness capacity than is demanded by his 
present occupation, if not rapidly promoted into another field of 
endeavor where a higher level of alertness is necessary, will 
become dissatisfied and leave for a more promising job. The 
worker falling in this latter class is, of course, very valuable to 
an industrial enterprise and the knowledge of his capacity in 
mental alertness would tend to save him for that industry which 
gives due weight to such information in its promotion and 
training policy. 

¹Most psychologists recognize, and rightly so, a decisive difference be- 
tween mental alertness and intelligence, mental alertness being only one 
factor in intelligence. See, e. g., Gault, Social Psychology (Holt, 1923), p. 77. 

The following two experiments made by the Scott Company 
yield illuminating data. The first one was a study of the relation 
of the number of years each man was retarded at the time of 
leaving school—as an index of mental alertness—as related to 
the stability of the working force in several production depart- 
ments of a manufacturing company. In this study there has been 
found a striking difference between departments, e. g., in the tool 
department where the work is of a high grade and varied, “the 
greatest proportion of dissatisfaction occurs among the workers 
who were most retarded in school. The stability increases as the 
amount of retardation lessens. . .” While in the inspection 
department of the same company, where the work is repetitive, 
monotonous and ‘fool-proof’ “the amount of dissatisfaction is 
low for those men who were very retarded in school. The per- 
centage of dissatisfaction increases markedly until, for those 
whose progress in school was normal. . .”³ 


The second study dealt with relationship of mental alertness 
test scores and labor turnover. The psychological test used was 
the Scott Company Mental Alertness Tests.⁴ This study has been 
made upon the employees of three different companies. For the 
sake of convenience we shall designate the three companies as 
X, Y, and Z. The test was given to the usual routine office clerks 
of companies X and Y, the only difference being that women clerks 
were the subjects in company X, and men clerks in company Y. 
In the case of company Z office boys only were examined. In each 
case there has been found to exist a definite relationship between 
the mental alertness of employees, as determined by test scores, 
and labor turnover. However, this relationship is not always the 
same for the three companies. 

²All of the freshmen at Northwestern University had to take the Scott 
Company Mental Alertness Test. It has been found that, after the lapse 
of one semester, of all the freshmen who scored in the upper one-third in 
the mental alertness tests, 66.3 per cent earned a grade which, if averaged 
throughout their college career, would enable them to graduate. Of the 
freshmen who scored in the lowest one-third in the mental alertness tests, 
only 20 per cent secured the necessary grade for graduation. Of the freshmen 
whose high school scholarship would place them in the fourth quarter, and 
who were admitted to the University on securing the medium or above in 
the mental alertness tests, only 16.6 per cent received the grade at the end 
of the first semester which, if kept up, would enable them to graduate. 
(The data for this note was secured from Mr. Paul Lester Palmer, North- 
western University.) 

³Scott and Hayes, Science and Common Sense in Working with Men (Ronald 
Press, 1921), p. 75. For a full account see also pp. 72-28, and the Scott 
Company Laboratory Bulletin L. No. 3, August 30, 1920. This bulletin is not 
published but mimeographed; the same is true of the bulletins referred to 
below. 

⁴Printed, not published, by the Scott Company, Philadelphia, Pa. 


For company X, it has been found that about 40 per cent of 
the women clerks hired who have an alertness score between 15 
and 80 leave within six months; this percentage decreasing rapidly 
with the increase in the test scores, until the least labor turnover 
is found among clerks who scored between 35 and 50. For those 
whose scores were above 50, the percentage leaving increases rather 
rapidly until we reach those scoring highest, where the instability is 
the greatest, greater than for the clerks scoring low in the mental 
alertness.⁵ From the above relationship of mental alertness and 
turnover we may infer that company X does not give due weight 
to the necessity of rapid promotion for its most alert clerks. 


For company Y, as in the above case of company X, the labor 
turnover is high for employees making low scores; instability rises 
with an upward trend of the alertness scores until the highest 
scores are reached, where there is found a pronounced increase 
in instability, a condition unlike that of company X. That differ- 
ence between the two companies appeared, at the time, to be due 
to a peculiar wage situation causing instability among the middle- 
high employees.⁶ That points again to our original contention re- 
garding the cause of high labor turnover, that complete generaliza- 
tion is impossible when the phenomenon is studied in the light of 
only one isolated factor. What can be said, however, is that it is not 
true that the “make shift” or the “general run” of employees 
is responsible for the high labor turnover, but among other 
factors the incompatibility of the requirements of the job with the 
capacity for mental alertness as manifested by the worker, is an 
important contributing cause—and perhaps, at times, the only 
cause for labor turnover. That becomes of further significance 
with the multiplication of our experiments, such as the following 
one, made in company Z. 

⁵Scott Company Laboratory Bulletin L. No. 2, March 25, 1920. 
⁶Ibid. 

In company Z in February, 1920, one hundred and eleven boys 
under eighteen years of age were given the mental alertness tests. 
The results were filed away and no use was made of them until No- 
vember, 1921, when the scores were brought out of the files and com- 
pared with the progress made by the boys during the intervening 
twenty-one months. It has been found that the average score 
for the one hundred and eleven boys was 38.7, while the score 
for the sixty-three boys, still in the company’s employ, was 41.6, 
and the average score of those no longer with the company was 
35.1. That the boys making a high score are most likely to re- 
main with the company becomes evident from the above figures; 
while the boys leaving the company’s employ did so for three 
reasons: first, for better positions (their average score was 44; 
notice that the score is higher than that of the boys remaining in 
the employ of the company); second, to return to school; third, 
discharged by the company (their average score was 23.8). 
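
The averages reported for company Z can be checked for consistency. The sketch below combines the two group averages into an overall average, weighting each by its number of boys. The count of forty-eight boys no longer with the company is not stated in the text; it is inferred here as 111 less 63.

```python
total_boys = 111
still_employed = 63
no_longer_employed = total_boys - still_employed   # inferred, not stated in the text

# (number of boys, average alertness score) for each group
groups = [
    (still_employed, 41.6),
    (no_longer_employed, 35.1),
]

# Weighted average of the group averages.
overall_average = sum(n * avg for n, avg in groups) / sum(n for n, _ in groups)

print(no_longer_employed, round(overall_average, 1))
```

The combined figure comes out close to the reported overall average of 38.7, so the three averages are mutually consistent to within rounding.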

If we should compare the salary of the boys in November, 1921, 
with the alertness scores we find the following relationship: 

Number of boys    Average alertness score    Average salary 
      10                   57.3                  $16.00 
      10                   23.1                  $13.40 

It has also been found that the average score, for the twenty- 
nine boys who were promoted to junior clerkship was 46.2, while 
the score 35.1 represents the average score of the thirty-four boys 
who have received no such promotion. 

It is not only interesting but it is of importance industrially 
to notice that the boys who scored relatively high in the test 
either continued in the employ of company Z or left for better 
positions. Almost all promotion and, therefore, higher rate of 
pay were won by the boys taken from the same class, even though 
the alertness scores were unknown to anyone save the industrial 
relations department. 

It seems to us that, while the results from the above experiment 
are too scanty to permit any positive generalizations, nevertheless 
they are suggestive of what might be done. Consequently, we may 
venture to make the following suggestions: first, that further 
study in this field might determine the relative alertness scores 
which are most suitable for some of the industrial occupations— 
where alertness has been found to be one important factor; second, 
that knowledge so gained will be of practical importance to the 
employment manager; third, that with this knowledge in the hands 
of the employment manager, he will be able not only to reduce 
labor turnover, where previously the instability was mainly due to 
the incompatibility of mental alertness with the responsibility of 
the occupation, but, also, he will be able to detect individuals of 
high alertness capacity whose services to the industry will be in 
their ability to learn rapidly and in that way gain quick promotion, 
ending finally with a job, the demands of which will be propor- 
tional to the worker’s ability. The third suggestion expresses an 
ideal situation which we may put in the sky, as Plato did with 
his ideal state. 











NOTES AND NEWS 
HEALTH FELLOWSHIPS AND SCHOLARSHIPS 


The American Child Health Association, believing that health 
is a matter of sound education, and that no greater creed can 
be spread through the world than the gospel of good health, has 
instituted an unique series of health scholarships. 

The sum of $10,000 has been allotted to these scholarships and 
fellowships, which are to be given to teachers in service who have 
been especially successful in teaching health. The money will 
be distributed in five $1,000 scholarships and fellowships, to 
college graduates now in service as principals or elementary 
school supervisors; five $500 scholarships to normal school gradu- 
ates or college graduates now engaged in classroom teaching; and 
fifteen $200 scholarships to holders of teacher’s certificates now 
engaged in classroom teaching. 

Those sums are not hard and fast ones, the Committee announces, 
but approximate, depending upon circumstances. It also announces 
that although the conditions require the teacher to be in service 
at present, an exception to this rule may be made in favor of any 
teacher who has already done exceptional work in health educa- 
tion, even if she is not engaged in teaching at present. 

The purpose of these scholarships and fellowships is to improve 
professional training, to enable teachers to do more effective work 
in health education. 

Fellowships and scholarships shall be awarded to teachers in 
service and to teachers not now in active service but who have had 
exceptionally successful experience in health education, who pre- 
sent evidence of some of the following qualifications: 

a. Success in teaching. 

b. Pleasing and favorable personality. 

c. Good personal health. 

d. Demonstration of unusual interest in health education. 

e. Possession of initiative, originality, leadership and organiz- 
ing ability. 

f. In addition to the above, a good scientific background is 
considered of special value, in particular, instruction in biology, 
bacteriology, chemistry, nutrition, personal hygiene, public health, 
child hygiene, school hygiene, child psychology and principles of 
teaching. 

The qualifications of the Institutions in which fellowships and 
scholarships are to be placed should include the following: 


1. Recognition of standards for the promotion of the personal 
health of their own students. 
2. A healthful environment in the institution for the setting up 
of desirable standards from the standpoint of sanitation, etc. 

3. A supplementary course of coordinated subjects of instruc- 
tion such as biology, nutrition, etc., in addition to the more specific 
training in health education. 

4. Opportunities for observation, preparation of programs, prac- 
tical teaching in health education. 

5. The providing of training for teachers in other fields than 
the field of health education. 

6. A spirit of co-operation among the various departments 
connected with health education, as evidenced by their success in 
co-ordinating the various elements contributing to health education. 

Fellowships and scholarships providing for a year’s study shall 
go into effect with the beginning of the school year or in the fall 
of 1923. 

Summer School, Chautauqua and Travelling Scholarships will 
go into effect during the summer of 1923. 

The fundamental subject matter in any student’s course should 
be derived from the following fields: Personal Hygiene, Nutrition, 
Community Hygiene, Social Hygiene, Mental Hygiene, Health and 
care of infants and young children and adolescence, First aid and 
safety, Hygiene of the worker, Home nursing and care of the sick, 
School Hygiene, Physical Education, Principles of Health Education 
and Practice Teaching. Practice Teaching is to include practice 
in all types of contact with children incident to health work in 
the school. 

For information address American Child Health Association, 
370 7th Avenue, New York City. 

BETTER EYE SIGHT AS A NATIONAL ASSET 

The Eye Sight Conservation Council of America represents a 
distinct and organized phase of the general tendency towards 
social, economic and physical betterment. It aims to bring about 
improved vision among the masses of the nation, experience in 
education and industry having shown that a long train of ills, 
involving grave consequences, follows in the wake of defective eye 
sight. 

One of the conspicuous services performed by the Council was 
its participation in the nationwide assay of waste conducted by 
Herbert Hoover as chairman of the Committee on Elimination of 
Waste in Industry of the Federated American Engineering So- 
cieties. This committee, whose field activities were directed by 
Mr. Wallace, found that heavy annual economic and physical losses 
were occurring in American industry through poor vision of the 
workers. All this, for the most part, was characterized as avoidable 
waste, which the Eye Sight Conservation Council has undertaken 
to wipe out by a thorough and systematic plan of education. 

Of the 24,000,000 school children in the United States, approximately 
10,000,000 can justly claim that their country is not giving 
them a square deal, according to Prof. Wood of Columbia. “For,” 
he says, “they are laboring under a grave handicap—eye defects— 
which could be mitigated to a great extent, if not corrected en- 
tirely, by the intelligent supervision of the state and of the teachers 
of the nation, through proper eye tests of all pupils.” Retarda- 
tion is another evil caused in part, at least, by imperfect sight. 

Experiments to determine the true economic value of perfect 
vision will be made. It is proposed to measure the improvement 
in health, increase in quality and quantity of production, advance- 
ment of individual performance, and decrease in losses due to 
waste and accident. Factory, home and school lighting, now a 
subject of scientific research here and abroad, will be studied. 

The Council, through its Field Secretary, Charles F. Southard, 
is prosecuting its work in the schools of many cities. This work 
embraces lectures to parents, teachers and pupils with motion 
picture illustrations. 

Co-operation with educational authorities is the prime purpose 
of the Council. An illustration of this co-operation is found in 
New York City, where the Council participated in the observance 
of Health Day in the schools, when the eyes of more than 1,000,000 
pupils were tested. 

Wherever the Council works, it strives to establish the princi- 
ples of continued supervision of the children, mothers co-operating 
with the teachers in identifying defects, often hidden, and in 
applying corrective measures. 


BOOK REVIEWS 


J. W. W. WALLIN. Causative Factors in Mental Inferiority and 
the Prevention of Degeneracy. Miami University, 1922, 48 pp. 

The writer gives a brief account of the factors causing mental 
inferiority. The article is a summary of his study of the personal 
and family histories of 872 clinic cases. The individuals studied 
are public school cases and not institutional or hospital cases; consequently 
the conclusions cannot be affected by hereditary cases, as is so 
true of institutional cases. After a thorough discussion 
of personal and family history data, and with many carefully 
worked out tables, the author draws the conclusion that inebriety 
occurs more frequently than any other factor. He also 
states that the same thing applies to immorality. He 
summarizes other causes of mental deficiency, such as head injury, 
cerebro-spinal meningitis, infantile paralysis, epilepsy, etc., but 
finds few cases resulting from them. 

Dr. Wallin gives a valuable summary of opinions among experts 
as to the degree of heredity in feeble-mindedness. His study 
shows that they are quite divergent in their beliefs. It seems 
that the “facts are not yet all in,” and that the causation of 
feeble-mindedness is as complex as its conditions. It seems from 
the study that we cannot depend solely upon colonization, 
sterilization and marriage laws for the reduction of feeble-mindedness, 
but must also prevent toxication of the parents, as that 
is a big factor in injuring the young child. 


Ohio University M. LA VINIA WARNER. 


J. W. W. WALLIN. An Investigation of the Sex Relationship, 
Marriage, Delinquency and Truancy of Children Assigned to 
Special Public School Classes. Miami University, 1922, pp. 34. 

This article is a summary of a final report to the Board of 
Education of the St. Louis Public Schools. After seven years of 
examining children for Special Education, Dr. Wallin concludes 
that the Board of Education would have to make provisions for 
more mentally defective boys than for mentally defective girls. 
He does not give any reasons for this. He found among the 
Special Class enrollment an average of 2.14 per cent per family and 
a per cent of 3.3 of all the children enrolled in Special Education 
to have a record of delinquency. He makes a valuable discovery 
relating to the criminality and viciousness of the feeble-minded. His 
discoveries are contrary to previous beliefs in this matter. Dr. 
Wallin suggests that the Special Class training given these children 
has corrected this anti-social conduct. The same thing is true 
of truancy. Dr. Wallin found only 4.5 per cent of the total 
enrollment of the children in Special Education to be rated 
as truants. The entire article is a well-rounded argument in favor 
of Special Education. 


Dr. Wallin gives a further study of the industrial records of 
children in Public School Special Classes for mental defectives 
and finds that the records both industrial and criminal so far as 
they go are quite satisfactory and point to the value of Special 
Education. However, a study of the occupations entered shows 
that the Special Schools have given very little specific training in 
preparation for them, and the employment records of these children 
furnish little evidence as to the type of vocational training which 
should be supplied in the Special Schools. 

The remainder of the article is taken up in a discussion of the 
Legislative Act in the interest of Special Education for the State 
of Missouri. 


Ohio University M. LA VINIA WARNER. 


STRONG AND UHRBROCK. Job Analysis and The Curriculum. 
Personnel Research Series. Williams and Wilkins Company. Balti- 
more, 1923. Pp. 146. Price $1.00 paper, $2.00 cloth. 

There is no doubt that what we chiefly need in our attempt to 
apply psychology to industrial management is a large mass of 
ordered material collected through actual field study and 
experiment on problems of industrial management. Be it problems 
of employment, of fatigue or of incentive, the important thing 
is not so much the ideal conditions as the creation of the possibility 
for experiment, so that we may learn, in the long run, what 
can be done to reach some workable scheme of human relations 
in this or that particular industry. Such studies have 
been published by Link, Chapman, Myers, Scott Company, Carnegie 
Institute of Technology and others; to our delight a new study of 
an important industrial problem by Strong and Uhrbrock has 
just appeared. 

Messrs. Strong and Uhrbrock set before themselves the task of 
the actual field study of the functions of the executives 
in the commercial printing industry. This survey “was 
made for the express purpose of inventorying the du- 
ties of executives and discovering where the emphasis should 
be placed in training men for similar positions” (p. 71.) “Not 
only was it worth while to secure more comprehensive information 
upon which to decide differences of opinion between the printing 
department faculty Carnegie Institute of Technology and printing 
executives, but also in order to decide differences of opinion between 
the faculty in the printing department and other departments 
in the Carnegie institute” (p. 71.) To supply material from which 
a well balanced and proper curriculum could be developed and organized, 
the authors set before themselves the task of making a 
comprehensive study of the various executive jobs within the 
printing industry. The first task was to make a job analysis and 
from that to develop the job specification. These would naturally 
supply the information necessary for empirical curriculum 
making. 

The present study, however, is more than an attempt to record 
the results of the above stated survey; as a matter of fact it is 
also an attempt to define the meaning of such an experimental 
attempt and to state the technique of obtaining the necessary data 
and result. Consequently we find in the book chapters dealing with 
such subjects as the following: job analysis, job specification, the 
construction of a curriculum based on job analysis, and a bibliography 
of job analysis material. 

Some “industrial psychologists” are coming to the conclusion that 
a job analysis made for the purpose of deriving a job specification 
(to be used by the employment and training department and 
by the wage committee) should contain a study of the actual functions 
of a job rather than a time and motion analysis of the 
worker. This latter type of job analysis has its value, but in a 
different and very important realm: the study of efficient 
working methods. “Job analysis” is perhaps not the best name, 
because of its traditional connotation, to designate a study of the 
different functions performed by any given worker. Such a study 
may perhaps be best designated by the name “occupational description.” 
This name seems to us to be more descriptive of the 
nature and content of such a study. The authors define a job 
analysis to be “a method of scientifically dissecting a job in order 
to determine the component elements and their influence upon the 
length of learning period of the worker, production and labor 
turnover” (p. 22.) This definition we fear is too broad: it covers 
more than the type of study made by the authors; it includes, it 
seems to us, even the most minute analysis of the worker’s time 
and motion, and of the method of handling himself, tools and raw 
material. 

A job specification, the authors say, is the record of results 
obtained from the job analysis (p. 23.) “The job analysis is the 
means; the specification is the end” (p. 24.) “Today the most satisfactory 
job specification is one that lists the specific duties in outline 
form and contains a complete analysis of duties, man requirements, 
working conditions, essential education and promotional 
possibilities” (p. 27.) While in the actual study itself, and 
to our admiration, the authors kept clearly in mind the different 
aims of a job analysis and those of a job specification, in their 
theoretical statements, as above quoted, we must confess the difference 
is not clearly stated, if not really confused; for example, 
the statement that a “satisfactory job specification is one that 
lists the specific duties in outline form and contains a complete 
analysis of duties.” Is that not a part of job analysis, or are the 
nature and content of a job analysis and of a job specification 
the same? Or is the difference only that of function? 

The content of the book is a valuable contribution to the much 
needed experimental work in personnel problems. Not only the 
authors and the psychologists in the Carnegie Institute of Tech- 
nology but also the editors of the Personnel Research Series should 
be commended for the type of work whose appearance they have all 
made possible. It is only to be regretted that more such studies 
have not appeared up to date, but we look with confidence to the 
future. 


Northwestern University A. J. SNOW. 


EDWIN A. SHAW AND EDWARD A. LINCOLN. A Comparison of 
the Intelligence and Training of School Children in a Massachu- 
setts Town. Harvard Monographs in Education, Number 1. Cam- 
bridge, Mass. The Graduate School of Education, Harvard Uni- 
versity. 1922. 


We have in this monograph a report of two surveys of the 
schools of a small town. Group tests of intelligence and edu- 
cational tests in arithmetic, writing and reading were given in 1921 
and again one year later. The study emphasizes the importance 
of the interpretation of survey results by the teachers and officers 
themselves. It is upon them then that the duty of providing 
remedial measures rests. The reports which are not effective in 
practice in the systems where the data are gathered fail of their 
main purpose. The authors also caution against over-emphasis of 
the formal processes which standard tests now measure; and the 
opinion is expressed that most standards are too high for practical 
purposes thus causing a sacrifice of the less easily measurable to 
the more easily measurable results. 

WALTER F. DEARBORN, EDWARD A. LINCOLN AND EDWIN A. SHAW. 
Standard Educational Tests in the Elementary Training Schools 
of Missouri. Harvard Monographs in Education, Number 3. Cam- 
bridge, Mass. The Graduate School of Education, Harvard Uni- 
versity. 1922. 

This monograph gives the detailed results, with some comment 
thereon, of the educational tests which were given in the investi- 
gation of teacher-training in Missouri. A general summary of the 
results has already been published in “The Professional Prepar- 
ation of Teachers for American Schools” issued by the Carnegie 
Foundation. It seems strange that in this university monograph 
“Cape Girardeau” should be almost consistently misspelled. 


Ohio University H. G. Good. 


A. B. Fitt. The Human Instincts in Business. The Lothian 
Book Publishing Company, Melbourne and Sydney, 1922. Pp. VIII 
and 100. 


The author of this book does not claim to have made any new 
contribution to our knowledge of instincts; he has attempted only 
to point out in what way our knowledge of human nature—in the 
major part our knowledge of instincts—can be of value in solving 
the problem of industrial relations, and that seems to be mainly a 
problem of incentives. The “book is an outline of some of the 
main features of a series of lectures delivered in 1918 to a body 
of men and women representative of the Melbourne business 
world” (p. VIII). 

Professor Fitt in his attempt to throw some light upon the difficult 
industrial problem shows a pretty thoroughgoing influence of 
Professor McDougall; while his application of Professor McDougall’s 
general theory of instinct is clear, it lacks the brilliancy 
of Professor Watts’ treatment in his Introduction to the Psychological 
Problems of Industry. One is familiar with such an attempt, 
i. e., as started in the United States by Professor Parker 
and followed with last mastery by Mr. Tead; in England by Pro- 
fessor Watts. Neither Mr. Tead’s nor Professor Fitt’s books are 
of great value as proposed solutions of the problem of economic 
motives—their worth lies in a different direction. Mr. Tead’s 
vivid examples of industrial maladjustment are useful as a source 
of cases; while Professor Fitt’s application of instincts to the general 
problem of economic motives is suggestive. Both, however, 
suffer because of a lack of method of approach to the problem. It 
seems to us that the only method is an empirical one and no other. 
The tendency today is to base too many deductions and generaliza- 
tions of wide economic application on our hypothetical knowledge 
of instincts. It is a pity that we should spend so much of our 
energy in rationalization. If we had knowledge of social behavior 
which was derived from experimentation or observation of a very 
large number of social phenomena then with caution we might 
apply such knowledge to our general problem of economic motives, 
but as the case is, our present social psychology is the most hypothetical 
branch of the science, and therefore such a procedure is at 
best only suggestive and at times, if taken seriously, dangerous. 
What we might do, however, is to collect a large number of cases 
and from these, if possible, infer certain hypotheses, and further 
by their application (social and economic experiments) come 
closer to a true knowledge of these complex human motives. If not, 
the result of such a set of lectures as these by Professor Fitt will 
produce comments such as the one stated in the introduction by the 
sponsor of the series: “while the hearer could not say that he had 
gained anything that he could directly apply in his business, he 
found that his point of view had been changed” (p. VII). 

Just as Mr. Tead’s Instincts in Industry so Professor Fitt’s book 
The Human Instincts in Business suggests—even though uncon- 
sciously, I presume, because such a point of view is impossible— 
that instincts as motives operate one at a time or if in groups as 
independent units and naturally we find the customary list of 
chapters, of which the following are some examples: “The Gre- 
garious Instinct,” “Suggestion and Imitation,” “Self-display, Pug- 
nacity, Contra-suggestion, Emulation,” “The Collecting, Acquiring, 
or Hoarding Instinct,” “The Instinct of Self-abasement in Relation 
to Self-assertion.” The following will serve as a specific illustra- 
tion of the author’s application of the supposed knowledge of in- 
stincts to a particular problem of industrial management: “Pay- 
ment by piece rate is the method employed in very many factories 
(in some countries industrial legislation will not allow it), and 
there can be no doubt that it gives the most direct approach to the 
original nature of the instinct. For each separate stroke or effort, 
for each unit of work completed, there is a separate unit in pay- 
ment. The reward stands in direct proportion to the work done. 
The various other methods of task-payment increase the pay with 
increases in the amount of work done, but the increase is not di- 
rectly proportional. Still the appeal is greater than that of the 
fixed wage, which does not present the continual draw or stimulus 
to effort.” 


“Again, once the rhythmic activity from effort to reward, re- 
ward to further effort is set up, it can be kept going to almost any 
extent, just as the squirrel or hamster adds to its store long after 
the winter’s needs have been satisfied. Thus, subject to the limits 
of human capacity, the appeal to effort of the task-payment sys- 
tems never ceases” (pp. 64-65). Here the author forgets the great 
dangers of a “piece rate system” upon the general health and effi- 
ciency of the worker; the nervous strain upon the slower employee 
and, what is more, the actual experimental proof that the use of 
wages as an incentive is rather limited. 

It is interesting to note, however, that the British psychologists 
more than the American—in the United States, however, a group 
of brilliant younger economists—have attempted to throw some 
light upon the much vexed problem of economic incentive—a prob- 
lem so disturbing to all students of industrial personnel. 

Northwestern University. A. J. Snow. 

University of Iowa Studies in Psychology, No. VIII, Edited by 
Carl E. Seashore. Psychological Review Co., Princeton, N. J., 
1922. Pp. IV, 382. Price (?) 

This collection of monographs represents a somewhat extended 
series of studies in sound localization, the psychology of music, 
and muscular reaction. It will be impossible to review all the 17 
articles in any thoroughgoing way because of their variability of 
subject matter, and variety of technique, since nine of the articles 
are doctors’ theses. 


The first three articles by Seashore, Halverson, and Stewart, 
respectively, deal with the wave-phase localization of sounds. Sea- 
shore sketches the progress made in the subject previous to the 
World War; Halverson presents a mass of observed data on the 
subject, but sets forth no conclusions; while Stewart, approaching 
the problem from the standpoint of physics, reaches the conclusion 
that the phase effect is not wholly due to varying intensity-ratios 
but that the organs of hearing respond to the phase as such. 

The next eight articles deal with the psychology of music. These 
are more or less condensed doctors’ theses. Bunch investigated 
the acuity of hearing throughout the tonal range. He reviews the 
previous work done in this field, describes the Iowa Pitch Range 
Audiometer which he perfected, and presents data to show wherein 
his instrument permits a much more effective means of detecting 
various types of defects of the internal ear than any instruments 
in previous use. Zuehl’s thesis describes a still further develop- 
ment of the instrument described by Bunch. He also establishes 
norms of auditory acuity based upon ages of 6 to 15, 17 to 41, and 
42 to 73 years, in which some very interesting facts with regard 
to pitch acuity for high tones are shown. Zuehl presents a very 
brief article describing a stroboscopic device for measuring revo- 
lution rates. As a piece of laboratory equipment this apparatus 
meets a difficult situation at reasonable cost. Knock has investi- 
gated a means of training the voice by the aid of the eye. His 
measurements are made with the tonoscope, the standard 256 fork, 
and the Koenig resonator. His subjects sang a standard tone, its 
major third, its fifth, and the octave. His results agree with pre- 
vious work in that the major third, the fifth, and the octave are 
found to be of about equal difficulty in singing. Errors in singing 
tend to a daily similarity. It is found that one can judge the pitch 
and inaccuracies of another voice better than of his own. Gaw 
applies the principle of vocational guidance to schools of music. 
Her data collected from students in music at Northwestern Univer- 
sity shows clearly by graphic representation of individual char- 
acteristics that musical surveys, thoroughly made, are of great 
value in advising students who desire a musical education. Stan- 
ton’s study of inheritance of specific musical capacities presents 
family histories, and a series of musical talent pedigree charts 
which lead her to the conclusion that musical talent is probably 
inherited according to Mendelian laws. Merry’s article describes 
a method of graphically recording all the speech inflections of the 
human voice. A detailed description of the apparatus, including 
an illustration, is given alongside sample charts and graphs. 
Schoen’s article is a study of the unconscious modifications of the 
singer’s voice while attempting to sing a musical melody. Especial 
attention is given the vibrato. He concludes that tones are usually 
attacked at a pitch lower than intended when preceded by a lower 
tone, and that they are released at a pitch higher than intended; 
that tones are rarely sustained their full interval; that two tones 
are seldom sung alike even by the same individual; that certain 
vowels are inherently higher than others; and that the emotion 
expressed by the vibrato is made possible because we have so frequently 
associated tremor with emotion in past experience. Wickham’s 
article is only a preliminary study in the control of intensity 
of sounds by touch. Agnew presents two articles in which she 
attempts to compare the auditory imagery of musicians, psychol- 
ogists, and children. She uses the questionnaire method and finds 
that musicians generally attribute to themselves an auditory 
imagery which develops with training; psychologists are more critical 
and believe that their musical images are more kinaesthetic 
than auditory; and children and untrained adults allege a very 
decided auditory imagery of sounds. Her study of great composers 
shows that they invariably possessed a vivid auditory imagery, but 
this was usually supported by strong imagery in other fields. 


The last three articles are studies in motor capacities. The first, 
by Miss Koerth, describes a pursuit apparatus designed to study 
eye-hand coordinations. A phonograph disc, with stylus, magnets, 
counter, etc., compose the apparatus. Ream’s article summarizes 
previous work with a tapping test, and describes a new technique 
for this much used experiment. Hensen’s article is a study of 
serial action. His apparatus consists of a commutator attached to 
a typewriter in such a way that any one of four circuits may be 
closed. His subjects were students, army recruits, musicians, and 
stenographers. He finds a decided personal equation in speed, no 
consistent relationship between speed and accuracy, some correlation 
with other motor tests, and slight positive correlation with intelligence 
tests and ability to read music at sight. 

Ohio University. C. H. Growdon. 